<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-skin-name" content="pelt">
<title>ZooKeeper Programmer's Guide</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
<link media="screen" type="text/css" href="skin/screen.css" rel="stylesheet">
<link media="print" type="text/css" href="skin/print.css" rel="stylesheet">
<link type="text/css" href="skin/profile.css" rel="stylesheet">
<script src="skin/getBlank.js" language="javascript" type="text/javascript"></script><script src="skin/getMenu.js" language="javascript" type="text/javascript"></script><script src="skin/fontsize.js" language="javascript" type="text/javascript"></script>
<link rel="shortcut icon" href="images/favicon.ico">
</head>
<body onload="init()">
<script type="text/javascript">ndeSetTextSize();</script>
<div id="top">
<!--+
    |breadtrail
    +-->
<div class="breadtrail">
<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://hadoop.apache.org/">Hadoop</a> &gt; <a href="http://hadoop.apache.org/zookeeper/">ZooKeeper</a><script src="skin/breadcrumbs.js" language="JavaScript" type="text/javascript"></script>
</div>
<!--+
    |header
    +-->
<div class="header">
<!--+
    |start group logo
    +-->
<div class="grouplogo">
<a href="http://hadoop.apache.org/"><img class="logoImage" alt="Hadoop" src="images/hadoop-logo.jpg" title="Apache Hadoop"></a>
</div>
<!--+
    |end group logo
    +-->
<!--+
    |start Project Logo
    +-->
<div class="projectlogo">
<a href="http://hadoop.apache.org/zookeeper/"><img class="logoImage" alt="ZooKeeper" src="images/zookeeper_small.gif" title="The Hadoop database"></a>
</div>
<!--+
    |end Project Logo
    +-->
<!--+
    |start Search
    +-->
<div class="searchbox">
<form action="http://www.google.com/search" method="get" class="roundtopsmall">
<input value="hadoop.apache.org" name="sitesearch" type="hidden"><input onFocus="getBlank (this, 'Search the site with google');" size="25" name="q" id="query" type="text" value="Search the site with google">&nbsp; 
                    <input name="Search" value="Search" type="submit">
</form>
</div>
<!--+
    |end search
    +-->
<!--+
    |start Tabs
    +-->
<ul id="tabs">
<li>
<a class="unselected" href="http://hadoop.apache.org/zookeeper/">Project</a>
</li>
<li>
<a class="unselected" href="http://wiki.apache.org/hadoop/ZooKeeper">Wiki</a>
</li>
<li class="current">
<a class="selected" href="index.html">ZooKeeper Documentation</a>
</li>
</ul>
<!--+
    |end Tabs
    +-->
</div>
</div>
<div id="main">
<div id="publishedStrip">
<!--+
    |start Subtabs
    +-->
<div id="level2tabs"></div>
<!--+
    |end Endtabs
    +-->
<script type="text/javascript"><!--
document.write("Last Published: " + document.lastModified);
//  --></script>
</div>
<!--+
    |breadtrail
    +-->
<div class="breadtrail">

             &nbsp;
           </div>
<!--+
    |start Menu, mainarea
    +-->
<!--+
    |start Menu
    +-->
<div id="menu">
<div onclick="SwitchMenu('menu_1.1', 'skin/')" id="menu_1.1Title" class="menutitle">Overview</div>
<div id="menu_1.1" class="menuitemgroup">
<div class="menuitem">
<a href="index.html">Welcome</a>
</div>
<div class="menuitem">
<a href="zookeeperOver.html">Overview</a>
</div>
<div class="menuitem">
<a href="zookeeperStarted.html">Getting Started</a>
</div>
<div class="menuitem">
<a href="releasenotes.html">Release Notes</a>
</div>
</div>
<div onclick="SwitchMenu('menu_selected_1.2', 'skin/')" id="menu_selected_1.2Title" class="menutitle" style="background-image: url('skin/images/chapter_open.gif');">Developer</div>
<div id="menu_selected_1.2" class="selectedmenuitemgroup" style="display: block;">
<div class="menuitem">
<a href="api/index.html">API Docs</a>
</div>
<div class="menupage">
<div class="menupagetitle">Programmer's Guide</div>
</div>
<div class="menuitem">
<a href="javaExample.html">Java Example</a>
</div>
<div class="menuitem">
<a href="zookeeperTutorial.html">Barrier and Queue Tutorial</a>
</div>
<div class="menuitem">
<a href="recipes.html">Recipes</a>
</div>
</div>
<div onclick="SwitchMenu('menu_1.3', 'skin/')" id="menu_1.3Title" class="menutitle">Admin &amp; Ops</div>
<div id="menu_1.3" class="menuitemgroup">
<div class="menuitem">
<a href="zookeeperAdmin.html">Administrator's Guide</a>
</div>
</div>
<div onclick="SwitchMenu('menu_1.4', 'skin/')" id="menu_1.4Title" class="menutitle">Contributor</div>
<div id="menu_1.4" class="menuitemgroup">
<div class="menuitem">
<a href="zookeeperInternals.html">ZooKeeper Internals</a>
</div>
</div>
<div onclick="SwitchMenu('menu_1.5', 'skin/')" id="menu_1.5Title" class="menutitle">Informal Documentation</div>
<div id="menu_1.5" class="menuitemgroup">
<div class="menuitem">
<a href="http://wiki.apache.org/hadoop/ZooKeeper">Wiki</a>
</div>
<div class="menuitem">
<a href="http://wiki.apache.org/hadoop/ZooKeeper/FAQ">FAQ</a>
</div>
<div class="menuitem">
<a href="http://hadoop.apache.org/zookeeper/mailing_lists.html">Mailing Lists</a>
</div>
<div class="menuitem">
<a href="zookeeperOtherInfo.html">Other Info</a>
</div>
</div>
<div id="credit"></div>
<div id="roundbottom">
<img style="display: none" class="corner" height="15" width="15" alt="" src="skin/images/rc-b-l-15-1body-2menu-3menu.png"></div>
<!--+
  |alternative credits
  +-->
<div id="credit2"></div>
</div>
<!--+
    |end Menu
    +-->
<!--+
    |start content
    +-->
<div id="content">
<div title="Portable Document Format" class="pdflink">
<a class="dida" href="zookeeperProgrammers.pdf"><img alt="PDF -icon" src="skin/images/pdfdoc.gif" class="skin"><br>
        PDF</a>
</div>
<h1>ZooKeeper Programmer's Guide</h1>
<h3>Developing Distributed Applications that use ZooKeeper</h3>
<div id="minitoc-area">
<ul class="minitoc">
<li>
<a href="#_introduction">Introduction</a>
</li>
<li>
<a href="#ch_zkDataModel">The ZooKeeper Data Model</a>
<ul class="minitoc">
<li>
<a href="#sc_zkDataModel_znodes">ZNodes</a>
<ul class="minitoc">
<li>
<a href="#sc_zkDataMode_watches">Watches</a>
</li>
<li>
<a href="#Data+Access">Data Access</a>
</li>
<li>
<a href="#Ephemeral+Nodes">Ephemeral Nodes</a>
</li>
<li>
<a href="#Unique+Naming">Unique Naming</a>
</li>
</ul>
</li>
<li>
<a href="#sc_timeInZk">Time in ZooKeeper</a>
</li>
<li>
<a href="#sc_zkStatStructure">ZooKeeper Stat Structure</a>
</li>
</ul>
</li>
<li>
<a href="#ch_zkSessions">ZooKeeper Sessions</a>
</li>
<li>
<a href="#ch_zkWatches">ZooKeeper Watches</a>
<ul class="minitoc">
<li>
<a href="#sc_WatchGuarantees">What ZooKeeper Guarantees about Watches</a>
</li>
<li>
<a href="#sc_WatchRememberThese">Things to Remember about Watches</a>
</li>
</ul>
</li>
<li>
<a href="#sc_ZooKeeperAccessControl">ZooKeeper access control using ACLs</a>
<ul class="minitoc">
<li>
<a href="#sc_ACLPermissions">ACL Permissions</a>
<ul class="minitoc">
<li>
<a href="#sc_BuiltinACLSchemes">Builtin ACL Schemes</a>
</li>
<li>
<a href="#Zookeeper+C+client+API">Zookeeper C client API</a>
</li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#ch_zkGuarantees">Consistency Guarantees</a>
</li>
<li>
<a href="#ch_bindings">Bindings</a>
<ul class="minitoc">
<li>
<a href="#Java+Binding">Java Binding</a>
</li>
<li>
<a href="#C+Binding">C Binding</a>
<ul class="minitoc">
<li>
<a href="#Installation">Installation</a>
</li>
<li>
<a href="#Using+the+Client">Using the Client</a>
</li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#ch_guideToZkOperations">Building Blocks: A Guide to ZooKeeper Operations</a>
<ul class="minitoc">
<li>
<a href="#sc_connectingToZk">Connecting to ZooKeeper</a>
</li>
<li>
<a href="#sc_readOps">Read Operations</a>
</li>
<li>
<a href="#sc_writeOps">Write Operations</a>
</li>
<li>
<a href="#sc_handlingWatches">Handling Watches</a>
</li>
<li>
<a href="#sc_miscOps">Miscellaneous ZooKeeper Operations</a>
</li>
</ul>
</li>
<li>
<a href="#ch_programStructureWithExample">Program Structure, with Simple Example</a>
</li>
<li>
<a href="#ch_gotchas">Gotchas: Common Problems and Troubleshooting</a>
</li>
</ul>
</div>
  

  

  

  
<a name="N1000B"></a><a name="_introduction"></a>
<h2 class="h3">Introduction</h2>
<div class="section">
<p>This document is a guide for developers wishing to create
    distributed applications that take advantage of ZooKeeper's coordination
    services. It contains conceptual and practical information.</p>
<p>The first four sections of this guide present higher-level
    discussions of various ZooKeeper concepts. These are necessary both for
    understanding how ZooKeeper works and for learning how to work with it. They
    concentrate on concepts rather than on complete programs, but they do assume a
    familiarity with the problems associated with distributed computing. The
    sections in this first group are:</p>
<ul>
      
<li>
        
<p>
<a href="#ch_zkDataModel">The ZooKeeper Data Model</a>
</p>
      
</li>

      
<li>
        
<p>
<a href="#ch_zkSessions">ZooKeeper Sessions</a>
</p>
      
</li>

      
<li>
        
<p>
<a href="#ch_zkWatches">ZooKeeper Watches</a>
</p>
      
</li>

      
<li>
        
<p>
<a href="#ch_zkGuarantees">Consistency Guarantees</a>
</p>
      
</li>
    
</ul>
<p>The next four sections of this guide provide practical programming
    information. These are:</p>
<ul>
      
<li>
        
<p>
<a href="#ch_guideToZkOperations">Building Blocks: A Guide to ZooKeeper Operations</a>
</p>
      
</li>

      
<li>
        
<p>
<a href="#ch_bindings">Bindings</a>
</p>
      
</li>

      
<li>
        
<p>
<a href="#ch_programStructureWithExample">Program Structure, with Simple Example</a>
        <em>[tbd]</em>
</p>
      
</li>

      
<li>
        
<p>
<a href="#ch_gotchas">Gotchas: Common Problems and Troubleshooting</a>
</p>
      
</li>
    
</ul>
<p>The book concludes with an <a href="#apx_linksToOtherInfo">appendix</a> containing links to other
    useful, ZooKeeper-related information.</p>
<p>Most of the information in this document is written to be accessible as
    stand-alone reference material. However, before starting your first
    ZooKeeper application, you should probably at least read the chapters on
    the <a href="#ch_zkDataModel">ZooKeeper Data Model</a> and <a href="#ch_guideToZkOperations">ZooKeeper Basic Operations</a>. Also,
    the <a href="#ch_programStructureWithExample">Simple Programming
    Example</a> <em>[tbd]</em> is helpful for understanding the basic
    structure of a ZooKeeper client application.</p>
</div>

  
<a name="N1007D"></a><a name="ch_zkDataModel"></a>
<h2 class="h3">The ZooKeeper Data Model</h2>
<div class="section">
<p>ZooKeeper has a hierarchical name space, much like a distributed file
    system. The only difference is that each node in the namespace can have
    data associated with it as well as children. It is like having a file
    system that allows a file to also be a directory. Paths to nodes are
    always expressed as canonical, absolute, slash-separated paths; there are
    no relative references. Any Unicode character can be used in a path, subject
    to the following constraints:</p>
<ul>
      
<li>
        
<p>The null character (\u0000) cannot be part of a path name. (This
        causes problems with the C binding.)</p>
      
</li>

      
<li>
        
<p>The following characters can't be used because they don't
        display well, or render in confusing ways: \u0001 - \u0019 and \u007F
        - \u009F.</p>
      
</li>

      
<li>
        
<p>The following characters are not allowed: \ud800 - \uF8FF,
        \uFFF0 - \uFFFF, \uXFFFE - \uXFFFF (where X is a digit 1 - E), \uF0000 -
        \uFFFFF.</p>
      
</li>

      
<li>
        
<p>The "." character can be used as part of another name, but "."
        and ".." cannot alone be used to indicate a node along a path,
        because ZooKeeper doesn't use relative paths. The following would be
        invalid: "/a/b/./c" or "/a/b/../c".</p>
      
</li>

      
<li>
        
<p>The token "zookeeper" is reserved.</p>
      
</li>
    
</ul>
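<p>The constraints above can be sketched as a small check. This is only an
    illustration, not ZooKeeper's own validation code: the class
    <em>ZnodePathCheck</em> and its methods are hypothetical names, it covers only the
    null character, the two control ranges, and the "." and ".." rule, and it reads
    "canonical" as also forbidding empty path segments, which is an assumption.</p>

```java
// Hypothetical helper, not part of the ZooKeeper API. It checks only a
// subset of the path rules listed above.
final class ZnodePathCheck {

    /** Returns null when the path passes the checks, otherwise a short reason. */
    static String problem(String path) {
        if (path == null || path.isEmpty()) {
            return "path is empty";
        }
        if (path.charAt(0) != '/') {
            return "path must be absolute (start with '/')";
        }
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '\u0000') {
                return "null character at index " + i;
            }
            // Ranges taken from the list above: U+0001 to U+0019 and U+007F to U+009F.
            if ((c >= '\u0001' && c <= '\u0019') || (c >= '\u007F' && c <= '\u009F')) {
                return "control character at index " + i;
            }
        }
        if (path.length() > 1) {
            // Limit -1 keeps trailing empty segments, so "/a/" is rejected too.
            for (String segment : path.substring(1).split("/", -1)) {
                if (segment.isEmpty()) {
                    return "empty path segment (assumed non-canonical)";
                }
                if (segment.equals(".") || segment.equals("..")) {
                    return "relative segment '" + segment + "'";
                }
            }
        }
        return null;
    }

    static boolean isValid(String path) {
        return problem(path) == null;
    }
}
```

<p>With this sketch, <em>/a/b.txt/c</em> passes, while <em>/a/b/./c</em> and
    <em>/a/b/../c</em> are rejected, matching the examples in the list.</p>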
<a name="N100A7"></a><a name="sc_zkDataModel_znodes"></a>
<h3 class="h4">ZNodes</h3>
<p>Every node in a ZooKeeper tree is referred to as a
      <em>znode</em>. Znodes maintain a stat structure that
      includes version numbers for data changes and ACL changes. The stat
      structure also has timestamps. The version number, together with the
      timestamp, allows ZooKeeper to validate the cache and to coordinate
      updates. Each time a znode's data changes, the version number increases.
      For instance, whenever a client retrieves data, it also receives the
      version of the data. And when a client performs an update or a delete,
      it must supply the version of the data of the znode it is changing. If
      the version it supplies doesn't match the actual version of the data,
      the update will fail. (This behavior can be overridden. For more
      information see... )<em>[tbd...]</em>
</p>
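<p>The version check described above is a form of optimistic concurrency
      control. The following in-memory sketch shows the idea only; the class
      <em>VersionedZnode</em> is a hypothetical stand-in for a single znode and is not
      the ZooKeeper client API.</p>

```java
// In-memory stand-in for a single znode; not the ZooKeeper client API.
final class VersionedZnode {
    private byte[] data;
    private int version = 0;

    VersionedZnode(byte[] initial) {
        this.data = initial.clone();
    }

    synchronized int version() {
        return version;
    }

    synchronized byte[] data() {
        return data.clone();
    }

    /**
     * Replaces the data only if expectedVersion equals the current version.
     * Returns the new version; throws IllegalStateException on a mismatch.
     */
    synchronized int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != version) {
            throw new IllegalStateException(
                "version mismatch: expected " + expectedVersion + ", actual " + version);
        }
        data = newData.clone();
        version++; // every successful data change increases the version
        return version;
    }
}
```

<p>A client that read version 0 can update once with that version. A second
      update that still supplies version 0 fails, because the first update
      already moved the znode to version 1.</p>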
<div class="note">
<div class="label">Note</div>
<div class="content">
        
<p>In distributed application engineering, the word
        <em>node</em> can refer to a generic host machine, a
        server, a member of an ensemble, a client process, etc. In the ZooKeeper
        documentation, <em>znodes</em> refer to the data nodes.
        <em>Servers</em> refer to the machines that make up the
        ZooKeeper service; <em>quorum peers</em> refer to the
        servers that make up an ensemble; client refers to any host or process
        which uses a ZooKeeper service.</p>
      
</div>
</div>
<p>Znodes are the main entity that a programmer accesses. They have
      several characteristics that are worth mentioning here.</p>
<a name="N100CA"></a><a name="sc_zkDataMode_watches"></a>
<h4>Watches</h4>
<p>Clients can set watches on znodes. Changes to that znode trigger
        the watch and then clear the watch. When a watch triggers, ZooKeeper
        sends the client a notification. More information about watches can be
        found in the section
        <a href="#ch_zkWatches">ZooKeeper Watches</a>.
        <em>[tbd]</em>
</p>
<a name="N100DA"></a><a name="Data+Access"></a>
<h4>Data Access</h4>
<p>The data stored at each znode in a namespace is read and written
        atomically. Reads get all the data bytes associated with a znode and a
        write replaces all the data. Each node has an Access Control List
        (ACL) that restricts who can do what.</p>
<a name="N100E4"></a><a name="Ephemeral+Nodes"></a>
<h4>Ephemeral Nodes</h4>
<p>ZooKeeper also has the notion of ephemeral nodes. These znodes
        exist as long as the session that created the znode is active. When
        the session ends, the znode is deleted. Because of this behavior,
        ephemeral znodes are not allowed to have children.</p>
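<p>A minimal in-memory sketch of this lifecycle follows. It is an
        illustration, not the server's implementation: the class
        <em>EphemeralDemo</em> and its methods are hypothetical, paths are assumed
        to be absolute, and a session id of zero is used to mark a regular znode.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model; not the ZooKeeper API.
final class EphemeralDemo {
    // path -> owning session id; 0 marks a regular (non-ephemeral) znode
    private final Map<String, Long> owners = new HashMap<>();

    /** Creates a znode. Pass ownerSessionId 0 for a regular znode. */
    void create(String path, long ownerSessionId) {
        // Assumes an absolute path, so lastIndexOf('/') is never -1.
        String parent = path.substring(0, path.lastIndexOf('/'));
        Long parentOwner = owners.get(parent);
        if (parentOwner != null && parentOwner != 0L) {
            throw new IllegalStateException("ephemeral znodes cannot have children: " + parent);
        }
        owners.put(path, ownerSessionId);
    }

    /** Ends a session and deletes every ephemeral znode it created. */
    void closeSession(long sessionId) {
        if (sessionId == 0L) {
            throw new IllegalArgumentException("0 is reserved for regular znodes");
        }
        owners.values().removeIf(owner -> owner == sessionId);
    }

    boolean exists(String path) {
        return owners.containsKey(path);
    }
}
```

<p>Because an ephemeral znode can never have children, ending the session
        never leaves orphaned descendants behind.</p>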
<a name="N100EE"></a><a name="Unique+Naming"></a>
<h4>Unique Naming</h4>
<p>Finally, when you create a znode, you can request that ZooKeeper
        append a monotonically increasing counter to the path name
        of the znode being created. This counter is unique to the parent
        znode.</p>
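<p>The per-parent counter can be sketched as follows. The class
        <em>SequenceNamer</em> is hypothetical, and the ten-digit zero-padded
        suffix is an assumption made for the illustration; this guide does not
        specify the format of the counter.</p>

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of per-parent sequence numbering; not the ZooKeeper API.
final class SequenceNamer {
    private final Map<String, Integer> nextByParent = new HashMap<>();

    /** Appends the parent's next counter value to the requested absolute path. */
    String nextName(String requestedPath) {
        String parent = requestedPath.substring(0, requestedPath.lastIndexOf('/'));
        // merge() returns the updated count, so subtract 1 to start at zero.
        int n = nextByParent.merge(parent, 1, Integer::sum) - 1;
        // Assumed format: ten digits, zero padded.
        return requestedPath + String.format(Locale.ROOT, "%010d", n);
    }
}
```

<p>Note that the counter belongs to the parent, not to the name prefix: two
        different prefixes under the same parent draw from the same sequence.</p>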
<a name="N100F9"></a><a name="sc_timeInZk"></a>
<h3 class="h4">Time in ZooKeeper</h3>
<p>ZooKeeper tracks time in multiple ways:</p>
<ul>
        
<li>
          
<p>
<strong>Zxid</strong>
</p>

          
<p>Every change to the ZooKeeper state receives a stamp in the
          form of a <em>zxid</em> (ZooKeeper Transaction Id).
          This exposes the total ordering of all changes to ZooKeeper. Each
          change will have a unique zxid and if zxid1 is smaller than zxid2
          then zxid1 happened before zxid2.</p>
        
</li>

        
<li>
          
<p>
<strong>Version numbers</strong>
</p>

          
<p>Every change to a node will cause an increase to one of the
          version numbers of that node. The three version numbers are version
          (number of changes to the data of a znode), cversion (number of
          changes to the children of a znode), and aversion (number of changes
          to the ACL of a znode).</p>
        
</li>

        
<li>
          
<p>
<strong>Ticks</strong>
</p>

          
<p>When using multi-server ZooKeeper, servers use ticks to define
          timing of events such as status uploads, session timeouts,
          connection timeouts between peers, etc. The tick time is only
          indirectly exposed through the minimum session timeout (2 times the
          tick time); if a client requests a session timeout less than the
          minimum session timeout, the server will tell the client that the
          session timeout is actually the minimum session timeout.</p>
        
</li>

        
<li>
          
<p>
<strong>Real time</strong>
</p>

          
<p>ZooKeeper doesn't use real time, or clock time, at all except
          to put timestamps into the stat structure on znode creation and
          znode modification.</p>
        
</li>
      
</ul>
<a name="N10131"></a><a name="sc_zkStatStructure"></a>
<h3 class="h4">ZooKeeper Stat Structure</h3>
<p>The Stat structure for each znode in ZooKeeper is made up of the
      following fields:</p>
<ul>
        
<li>
          
<p>
<strong>czxid</strong>
</p>

          
<p>The zxid of the change that caused this znode to be
          created.</p>
        
</li>

        
<li>
          
<p>
<strong>mzxid</strong>
</p>

          
<p>The zxid of the change that last modified this znode.</p>
        
</li>

        
<li>
          
<p>
<strong>ctime</strong>
</p>

          
<p>The time in milliseconds from epoch when this znode was
          created.</p>
        
</li>

        
<li>
          
<p>
<strong>mtime</strong>
</p>

          
<p>The time in milliseconds from epoch when this znode was last
          modified.</p>
        
</li>

        
<li>
          
<p>
<strong>version</strong>
</p>

          
<p>The number of changes to the data of this znode.</p>
        
</li>

        
<li>
          
<p>
<strong>cversion</strong>
</p>

          
<p>The number of changes to the children of this znode.</p>
        
</li>

        
<li>
          
<p>
<strong>aversion</strong>
</p>

          
<p>The number of changes to the ACL of this znode.</p>
        
</li>

        
<li>
          
<p>
<strong>ephemeralOwner</strong>
</p>

          
<p>The session id of the owner of this znode if the znode is an
          ephemeral node. If it is not an ephemeral node, it will be
          zero.</p>
        
</li>

        
<li>
          
<p>
<strong>dataLength</strong>
</p>

          
<p>The length of the data field of this znode.</p>
        
</li>

        
<li>
          
<p>
<strong>numChildren</strong>
</p>

          
<p>The number of children of this znode.</p>
        
</li>

      
</ul>
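<p>As a reading aid, the fields above can be pictured as a plain value
      class. <em>StatSketch</em> is a hypothetical illustration, not the Stat class
      shipped with the bindings, and the Java field types chosen here are
      assumptions. The two helper methods only restate rules given in this guide:
      a non-zero ephemeralOwner marks an ephemeral znode, and a smaller zxid
      happened earlier.</p>

```java
// Hypothetical value class mirroring the fields listed above.
final class StatSketch {
    final long czxid;          // zxid of the change that created the znode
    final long mzxid;          // zxid of the change that last modified the znode
    final long ctime;          // creation time, milliseconds from epoch
    final long mtime;          // last modification time, milliseconds from epoch
    final int version;         // number of changes to the data
    final int cversion;        // number of changes to the children
    final int aversion;        // number of changes to the ACL
    final long ephemeralOwner; // owning session id, or 0 if not ephemeral
    final int dataLength;      // length of the data field
    final int numChildren;     // number of children

    StatSketch(long czxid, long mzxid, long ctime, long mtime,
               int version, int cversion, int aversion,
               long ephemeralOwner, int dataLength, int numChildren) {
        this.czxid = czxid;
        this.mzxid = mzxid;
        this.ctime = ctime;
        this.mtime = mtime;
        this.version = version;
        this.cversion = cversion;
        this.aversion = aversion;
        this.ephemeralOwner = ephemeralOwner;
        this.dataLength = dataLength;
        this.numChildren = numChildren;
    }

    boolean isEphemeral() {
        return ephemeralOwner != 0L;
    }

    /** True if this znode's last modification was ordered before the other's. */
    boolean lastModifiedBefore(StatSketch other) {
        return mzxid < other.mzxid;
    }
}
```

<p>Comparing mzxid values rather than mtime values is deliberate: zxids
      expose the total order of changes, while timestamps come from clock
      time.</p>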
</div>

  
<a name="N101A3"></a><a name="ch_zkSessions"></a>
<h2 class="h3">ZooKeeper Sessions</h2>
<div class="section">
<p>When a client gets a handle to the ZooKeeper service, ZooKeeper
    creates a ZooKeeper session, represented as a 64-bit number, that it
    assigns to the client. If the client connects to a different ZooKeeper
    server, it will send the session id as a part of the connection handshake.
    As a security measure, the server creates a password for the session id
    that any ZooKeeper server can validate. The password is sent to the client
    with the session id when the
    client establishes the session. The client sends this password with the
    session id whenever it reestablishes the session with a new server.</p>
<p>One of the parameters to the ZooKeeper client library call to create
    a ZooKeeper session is the session timeout in milliseconds. The client
    sends a requested timeout, and the server responds with the timeout that it
    can give the client. The current implementation requires that the timeout
    be between 2 times the tickTime (as set in the server configuration) and
    60 seconds.</p>
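<p>The negotiation can be sketched as a simple clamp. The lower bound
    follows the rule stated in this guide (the server answers with the minimum when
    the request is below it). Treating the 60 second upper bound the same way is an
    assumption, as are the names <em>SessionTimeouts</em> and
    <em>negotiate</em>.</p>

```java
// Hypothetical sketch of session timeout negotiation; not server code.
final class SessionTimeouts {
    static final int MAX_TIMEOUT_MS = 60_000; // 60 seconds

    /** Returns the timeout the server would grant for a requested timeout. */
    static int negotiate(int requestedMs, int tickTimeMs) {
        int minTimeoutMs = 2 * tickTimeMs;
        // Assumption: requests above the maximum are lowered to the maximum.
        int capped = Math.min(requestedMs, MAX_TIMEOUT_MS);
        // Stated rule: requests below the minimum are raised to the minimum.
        return Math.max(minTimeoutMs, capped);
    }
}
```

<p>For example, with a tickTime of 2000 ms a request for 1000 ms is answered
    with 4000 ms, while a request for 10000 ms is granted unchanged.</p>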
<p>The session is kept alive by requests sent by the client. If the
    session is idle for a period of time that would time out the session, the
    client will send a PING request to keep the session alive. This PING
    request not only allows the ZooKeeper server to know that the client is
    still active, but it also allows the client to verify that its connection
    to the ZooKeeper server is still active. The timing of the PING is
    conservative enough to ensure reasonable time to detect a dead connection
    and reconnect to a new server.</p>
</div>

  
<a name="N101B3"></a><a name="ch_zkWatches"></a>
<h2 class="h3">ZooKeeper Watches</h2>
<div class="section">
<p>All of the read operations in ZooKeeper - <strong>getData()</strong>, <strong>getChildren()</strong>, and <strong>exists()</strong> - have the option of setting a watch as a
    side effect. Here is ZooKeeper's definition of a watch: a watch event is a
    one-time trigger, sent to the client that set the watch, which occurs when
    the data for which the watch was set changes. There are three key points
    to consider in this definition of a watch:</p>
<ul>
      
<li>
        
<p>
<strong>One-time trigger</strong>
</p>

        
<p>One watch event will be sent to the client when the data has changed.
        For example, if a client does a getData("/znode1", true) and later the
        data for /znode1 is changed or deleted, the client will get a watch
        event for /znode1. If /znode1 changes again, no watch event will be
        sent unless the client has done another read that sets a new
        watch.</p>
      
</li>

      
<li>
        
<p>
<strong>Sent to the client</strong>
</p>

        
<p>This implies that an event is on the way to the client, but may
        not reach the client before the successful return code to the change
        operation reaches the client that initiated the change. Watches are
        sent asynchronously to watchers. ZooKeeper provides an ordering
        guarantee: a client will never see a change for which it has set a
        watch until it first sees the watch event. Network delays or other
        factors may cause different clients to see watches and return codes
        from updates at different times. The key point is that everything seen
        by the different clients will have a consistent order.</p>
      
</li>

      
<li>
        
<p>
<strong>The data for which the watch was
        set</strong>
</p>

        
<p>This refers to the different ways a node can change. ZooKeeper
        maintains two lists of watches: data watches and child watches.
        getData() and exists() set data watches. getChildren() sets child
        watches. Thus, setData() will trigger data watches for the znode being
        set (assuming the set is successful). A successful create() will
        trigger a data watch for the znode being created and a child watch for
        the parent znode. A successful delete() will trigger both a data watch
        and a child watch (since there can be no more children) for a znode
        being deleted as well as a child watch for the parent znode.</p>
      
</li>
    
</ul>
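<p>The one-time trigger and the two watch lists can be modelled in a few
    lines. This is a simplified illustration, not server code: the class
    <em>WatchTable</em>, its methods, and the event strings are all hypothetical.
    Removing the entry from the table before delivering is what makes each watch
    fire only once.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of data watches and child watches; not the ZooKeeper API.
final class WatchTable {
    private final Map<String, List<String>> dataWatches = new HashMap<>();
    private final Map<String, List<String>> childWatches = new HashMap<>();
    final List<String> delivered = new ArrayList<>(); // "client:event:path"

    void watchData(String path, String client) {
        dataWatches.computeIfAbsent(path, p -> new ArrayList<>()).add(client);
    }

    void watchChildren(String path, String client) {
        childWatches.computeIfAbsent(path, p -> new ArrayList<>()).add(client);
    }

    void onSetData(String path) {
        fire(dataWatches, path, "data-changed");
    }

    void onCreate(String path) {
        fire(dataWatches, path, "created");
        fire(childWatches, parentOf(path), "children-changed");
    }

    void onDelete(String path) {
        fire(dataWatches, path, "deleted");
        fire(childWatches, path, "deleted");
        fire(childWatches, parentOf(path), "children-changed");
    }

    private void fire(Map<String, List<String>> table, String path, String event) {
        List<String> clients = table.remove(path); // removal clears the watch
        if (clients == null) {
            return;
        }
        for (String client : clients) {
            delivered.add(client + ":" + event + ":" + path);
        }
    }

    private static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }
}
```

<p>In this model, setting the data of /znode1 twice after a single
    watchData() call delivers one event. The second change is silent until the
    client sets a new watch.</p>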
<p>Watches are maintained locally at the ZooKeeper server to which the
    client is connected. This allows watches to be lightweight to set,
    maintain, and dispatch. When a client connects to a new server, the watch
    will be triggered for any session events. Watches will not be received
    while disconnected from a server. When a client reconnects, any previously
    registered watches will be reregistered and triggered if needed. In
    general this all occurs transparently. There is one case where a watch
    may be missed: a watch for the existence of a znode not yet created will
    be missed if the znode is created and deleted while disconnected.</p>
<a name="N101E9"></a><a name="sc_WatchGuarantees"></a>
<h3 class="h4">What ZooKeeper Guarantees about Watches</h3>
<p>With regard to watches, ZooKeeper maintains these
      guarantees:</p>
<ul>
        
<li>
          
<p>Watches are ordered with respect to other events, other
          watches, and asynchronous replies. The ZooKeeper client libraries
          ensure that everything is dispatched in order.</p>
        
</li>
      
</ul>
<ul>
        
<li>
          
<p>A client will see a watch event for a znode it is watching
          before seeing the new data that corresponds to that znode.</p>
        
</li>
      
</ul>
<ul>
        
<li>
          
<p>The order of watch events from ZooKeeper corresponds to the
          order of the updates as seen by the ZooKeeper service.</p>
        
</li>
      
</ul>
<a name="N1020E"></a><a name="sc_WatchRememberThese"></a>
<h3 class="h4">Things to Remember about Watches</h3>
<ul>
        
<li>
          
<p>Watches are one-time triggers; if you get a watch event and
          you want to get notified of future changes, you must set another
          watch.</p>
        
</li>
      
</ul>
<ul>
        
<li>
          
<p>Because watches are one-time triggers and there is latency
          between getting the event and sending a new request to get a watch,
          you cannot reliably see every change that happens to a node in
          ZooKeeper. Be prepared to handle the case where the znode changes
          multiple times between getting the event and setting the watch
          again. (You may not care, but at least realize it may
          happen.)</p>
        
</li>
      
</ul>
<ul>
        
<li>
          
<p>A watch object, or function/context pair, will only be
          triggered once for a given notification. For example, if the same
          watch object is registered for an exists and a getData call for the
          same znode and that znode is then deleted, the watch object would
          only be invoked once with the deletion notification for the znode.
          </p>
        
</li>
      
</ul>
<ul>
        
<li>
          
<p>When you disconnect from a server (for example, when the
          server fails), you will not get any watches until the connection
          is reestablished. For this reason session events are sent to all
          outstanding watch handlers. Use session events to go into a safe
          mode: you will not be receiving events while disconnected, so your
          process should act conservatively in that mode.</p>
        
</li>
      
</ul>
</div>

  
<a name="N1023A"></a><a name="sc_ZooKeeperAccessControl"></a>
<h2 class="h3">ZooKeeper access control using ACLs</h2>
<div class="section">
<p>ZooKeeper uses ACLs to control access to its znodes (the data nodes of a ZooKeeper data tree). The ACL implementation is quite similar to UNIX file access permissions: it employs permission bits to allow/disallow various operations against a node and the scope to which the bits apply. Unlike standard UNIX permissions, a ZooKeeper node is not limited by the three standard scopes for user (owner of the file), group, and world (other). ZooKeeper does not have a notion of an owner of a znode. Instead, an ACL specifies sets of ids and permissions that are associated with those ids.</p>
<p>ZooKeeper supports pluggable authentication schemes. Ids are specified using the form <em>scheme:id</em>, where <em>scheme</em> is the authentication scheme that the id corresponds to. For example, <em>host:host1.corp.com</em> is an id for a host named <em>host1.corp.com</em>.</p>
<p>When a client connects to ZooKeeper and authenticates itself, ZooKeeper associates all the ids that correspond to a client with the client's connection. These ids are checked against the ACLs of znodes when a client tries to access a node. ACLs are made up of pairs of <em>(scheme:expression, perms)</em>. The format of the <em>expression</em> is specific to the scheme. For example, the pair <em>(ip:19.22.0.0/16, READ)</em> gives the <em>READ</em> permission to any client with an IP address that starts with 19.22.</p>
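<p>To make the <em>ip</em> example concrete, the sketch below compares the
    most significant bits of two IPv4 addresses. It is an illustration only: the
    class <em>IpAclMatch</em> is hypothetical, it handles IPv4 dotted-quad
    addresses only, and treating an expression without a <em>/bits</em> part as an
    exact match is an assumption.</p>

```java
// Hypothetical helper for matching an ip ACL expression; not the ZooKeeper API.
final class IpAclMatch {

    /** Parses a dotted-quad IPv4 address into a 32-bit value. */
    static int parseIpv4(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Matches a client address against an expression such as 19.22.0.0/16. */
    static boolean matches(String expression, String clientIp) {
        int slash = expression.indexOf('/');
        String addr = slash < 0 ? expression : expression.substring(0, slash);
        // Assumption: no "/bits" part means all 32 bits must match.
        int bits = slash < 0 ? 32 : Integer.parseInt(expression.substring(slash + 1));
        if (bits < 0 || bits > 32) {
            throw new IllegalArgumentException("bits out of range: " + bits);
        }
        // A shift by 32 is a no-op for int in Java, so handle bits == 0 separately.
        int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return (parseIpv4(addr) & mask) == (parseIpv4(clientIp) & mask);
    }
}
```

<p>With the expression from the text, 19.22.4.7 matches
    <em>19.22.0.0/16</em>, while 19.23.0.1 does not.</p>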
<a name="N10261"></a><a name="sc_ACLPermissions"></a>
<h3 class="h4">ACL Permissions</h3>
<p>ZooKeeper supports the following permissions:</p>
<ul>
        
<li>
<p>
<strong>CREATE</strong>: you can create a child node</p>
</li>
        
<li>
<p>
<strong>READ</strong>: you can get data from a node and list its children.</p>
</li>
        
<li>
<p>
<strong>WRITE</strong>: you can set data for a node</p>
</li>
        
<li>
<p>
<strong>DELETE</strong>: you can delete a child node</p>
</li>
        
<li>
<p>
<strong>ADMIN</strong>: you can set permissions</p>
</li>
      
</ul>
<p>The <em>CREATE</em> and <em>DELETE</em> permissions have been broken out of the <em>WRITE</em> permission for finer grained access controls. The cases for <em>CREATE</em> and <em>DELETE</em> are the following:</p>
<p>You want A to be able to do a set on a ZooKeeper node, but not be able to <em>CREATE</em> or <em>DELETE</em> children.</p>
<p>
<em>CREATE</em> without <em>DELETE</em>: clients create requests by creating ZooKeeper nodes in a parent directory. You want all clients to be able to add, but only the request processor can delete. (This is kind of like the APPEND permission for files.)</p>
<p>Also, the <em>ADMIN</em> permission is there since ZooKeeper doesn&rsquo;t have a notion of file owner. In some sense the <em>ADMIN</em> permission designates the entity as the owner. ZooKeeper doesn&rsquo;t support the LOOKUP permission (execute permission bit on directories to allow you to LOOKUP even though you can't list the directory). Everyone implicitly has LOOKUP permission. This allows you to stat a node, but nothing more. (The problem is, if you want to call zoo_exists() on a node that doesn't exist, there is no permission to check.)</p>
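<p>Because permissions are bits, a set of permissions is an integer mask
      and a permission check is a bitwise test. The sketch below illustrates this
      with a hypothetical class <em>PermBits</em>. The specific bit values are
      assumptions chosen for the example and should not be read as the values used by
      the bindings.</p>

```java
// Hypothetical permission bits; the numeric values are assumptions.
final class PermBits {
    static final int READ   = 1 << 0;
    static final int WRITE  = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN  = 1 << 4;
    static final int ALL    = READ | WRITE | CREATE | DELETE | ADMIN;

    /** True if every bit in required is present in granted. */
    static boolean allows(int granted, int required) {
        return (granted & required) == required;
    }
}
```

<p>The request-queue case above then reads naturally: ordinary clients are
      granted only CREATE on the parent, the request processor is granted READ and
      DELETE, and WRITE alone allows neither creating nor deleting children.</p>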
<a name="N102B7"></a><a name="sc_BuiltinACLSchemes"></a>
<h4>Builtin ACL Schemes</h4>
<p>ZooKeeper has the following built-in schemes:</p>
<ul>
        
<li>
<p>
<strong>world</strong> has a single id, <em>anyone</em>, that represents anyone.</p>
</li>
        
<li>
<p>
<strong>auth</strong> doesn't use any id, represents any authenticated user.</p>
</li>
        
<li>
<p>
<strong>digest</strong> uses a <em>username:password</em> string to generate MD5 hash which is then used as an ACL ID identity. Authentication is done by sending the <em>username:password</em> in clear text. When used in the ACL the expression will be the <em>username:base64</em> encoded <em>SHA1</em> password <em>digest</em>.</p>
</li>
        
<li>
<p>
<strong>host</strong> uses the client host name as an ACL ID identity. The ACL expression is a hostname suffix. For example, the ACL expression <em>host:corp.com</em> matches the ids <em>host:host1.corp.com</em> and <em>host:host2.corp.com</em>, but not <em>host:host1.store.com</em>.</p>
</li>
        
<li>
<p>
<strong>ip</strong> uses the client host IP as an ACL ID identity. The ACL expression is of the form <em>addr/bits</em> where the most significant <em>bits</em> of <em>addr</em> are matched against the most significant <em>bits</em> of the client host IP.</p>
</li>
      
</ul>
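<p>Two of these schemes can be sketched with the Java standard library
      alone. The class <em>BuiltinSchemes</em> and its methods are hypothetical. For
      <em>digest</em>, the sketch assumes that the SHA1 digest is computed over the
      whole <em>username:password</em> string, which this guide does not spell out.
      For <em>host</em>, it assumes that a suffix must match on a label boundary or
      match the whole name.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helpers illustrating the digest and host schemes.
final class BuiltinSchemes {

    /** Builds username:base64(SHA1(username:password)) from username:password. */
    static String digestId(String usernameColonPassword) {
        int colon = usernameColonPassword.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected username:password");
        }
        String username = usernameColonPassword.substring(0, colon);
        try {
            // Assumption: the hash input is the full "username:password" string.
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                .digest(usernameColonPassword.getBytes(StandardCharsets.UTF_8));
            return username + ":" + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    /** True if clientHost equals the expression or ends with "." + expression. */
    static boolean hostMatches(String expression, String clientHost) {
        return clientHost.equals(expression) || clientHost.endsWith("." + expression);
    }
}
```

<p>With this sketch, <em>corp.com</em> matches <em>host1.corp.com</em> and
      <em>host2.corp.com</em> but not <em>host1.store.com</em>, as in the host
      example above.</p>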
<a name="N1030C"></a><a name="Zookeeper+C+client+API"></a>
<h4>Zookeeper C client API</h4>