author | Yiteng Zhang <yiteng.zhang@oracle.com> |
Wed, 09 Mar 2016 11:27:23 -0800 | |
changeset 3321 | 52e8eec3014c |
parent 2690 | 11a8cae074e0 |
permissions | -rw-r--r-- |
2690
11a8cae074e0
7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
.. This document is formatted using reStructuredText, which is a Markup
   Syntax and Parser Component of Docutils for Python. An html version
   of this document can be generated using the following command:
       rst2html.py doc/parallel-linked-images.txt >doc/parallel-linked-images.html

======================
Parallel Linked Images
======================

:Author: Edward Pilatowicz
:Version: 0.1


Problems
========

Currently, linked image recursion is done serially and in stages. For
example, when we perform a "pkg update" on an image, then for each child
image we execute multiple pkg.1 CLI operations. The multiple pkg.1
invocations on a single child image correspond to the following
sequential stages of pkg.1 execution:

1) publisher check: sanity check child publisher configuration against
   parent publisher configuration.
2) planning: plan fmri and action changes.
3) preparation: download content needed to execute planned changes.
4) execution: execute planned changes.

So to update an image with children, we invoke pkg.1 four times for each
child image. This architecture is inefficient for multiple reasons:

- we don't do any operations on child images in parallel

- when executing multiple pkg.1 invocations to perform a single
  operation on a child image, we are constantly throwing out and
  re-initializing lots of pkg.1 state.

To make matters worse, as we execute stages 3 and 4 on a child image the
pkg client also re-executes previous stages. For example, when we start
stage 4 (execution) we re-execute stages 2 and 3. So for each child we
update we end up invoking stage 2 three times, and stage 3 twice. This
leads to bugs like 18393 (where it seems that we download packages
twice). It also means that we have caching code buried within the
packaging system that attempts to cache internal state to disk in an
effort to speed up subsequent re-runs of previous stages.
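
The repeated work described above can be tallied with a small sketch.
It only models the counting. The stage names come from the list above,
while the function names are hypothetical, and it assumes the publisher
check runs only in its own invocation (the text does not say whether
later invocations repeat it).

```python
from collections import Counter

# Stage names as listed in this document, in execution order.
STAGES = ["publisher check", "planning", "preparation", "execution"]


def stages_run_by_invocation(target):
    """Return the stages one pkg.1 invocation runs to reach `target`.

    Assumption: the publisher check runs only in its own invocation,
    while every later invocation replays planning onward up to its
    target stage.
    """
    idx = STAGES.index(target)
    if idx == 0:
        return [STAGES[0]]
    return STAGES[1:idx + 1]


def count_stage_runs():
    """Count how often each stage runs across the four invocations."""
    counts = Counter()
    for target in STAGES:
        counts.update(stages_run_by_invocation(target))
    return counts


counts = count_stage_runs()
for stage in STAGES:
    print(stage, counts[stage])
```

Under these assumptions the tally matches the text: planning runs three
times and preparation twice for every child image.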


Solutions
=========


Eliminate duplicate work
------------------------

We want to eliminate much of the duplicate work done when executing
packaging operations on children in stages. To do this we will update
the pkg client API to allow callers to:

- Save an image plan to disk.
- Load an image plan from disk.
- Execute a plan loaded from disk without first "preparing" it. (This
  assumes that the caller has already "prepared" the plan in a previous
  invocation.)

In addition to eliminating duplicated work during staged execution, this
will also allow us to stop caching intermediate state internally within
the packaging system. Instead, client.py will be enhanced to cache the
image plan, and it will be the only component that knows about "staging".

To allow us to save and restore plans, all image plan data will be saved
within a PlanDescription object, and we will support serializing this
object into a JSON format. The JSON format for saved image plans is an
internal, unstable, and unversioned private interface. We will not
support saving an image plan to disk and then executing it later with a
different version of the packaging system on a different host. Also,
even though we will be adding data to the PlanDescription object, we
will not be exposing any new information about an image plan via the
PlanDescription object to API consumers.
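
A minimal sketch of the save/load/execute cycle follows. It is not the
real PlanDescription class: ``PlanSketch``, its two fields, and the
``execute`` helper are hypothetical stand-ins chosen for illustration,
and the on-disk layout is simply whatever ``json.dump`` produces for
them.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field


@dataclass
class PlanSketch:
    """Hypothetical stand-in for the state a saved image plan holds."""

    fmri_changes: list = field(default_factory=list)  # [old, new] pairs
    prepared: bool = False  # True once content has been downloaded

    def save(self, path):
        """Serialize the plan to a JSON file."""
        with open(path, "w") as f:
            json.dump(asdict(self), f)

    @classmethod
    def load(cls, path):
        """Rebuild a plan from a JSON file written by save()."""
        with open(path) as f:
            return cls(**json.load(f))


def execute(plan):
    """Execute a loaded plan without re-planning or re-preparing it."""
    if not plan.prepared:
        raise RuntimeError("plan must be prepared before execution")
    return [new for _old, new in plan.fmri_changes]


# One invocation plans and prepares, then saves the plan.
plan = PlanSketch(fmri_changes=[["pkg://example/foo@1.0",
                                 "pkg://example/foo@1.1"]])
plan.prepared = True
path = os.path.join(tempfile.mkdtemp(), "plan.json")
plan.save(path)

# A later invocation loads the plan and goes straight to execution.
loaded = PlanSketch.load(path)
installed = execute(loaded)
```

Keeping a ``prepared`` flag inside the saved data is one way for the
executing invocation to check that the earlier stage really ran; the
actual design may track this differently.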

An added advantage of allowing API consumers to save an image plan to
disk is that it should help with our plans to have the api.gen_plan_*()
functions return PlanDescription objects for child images. A file
descriptor (or path) associated with a saved image plan would be one way
for child images to pass image plans back to their parent (which could
then load them and yield them as results to api.gen_plan_*()).


Update children in parallel
---------------------------

We want to enhance the package client so that it can update child images
in parallel.

Due to potential resource constraints (CPU, memory, and disk IO) we
cannot entirely remove the ability to operate on child images serially.
Instead, we plan to allow for a concurrency setting that specifies how
many child images we are willing to update in parallel. By default,
when operating on child images we will use a concurrency setting of 1,
which maintains the current behavior of the packaging system. If a user
wants to specify a higher concurrency setting, they can use the "-C N"
option to subcommands that recurse (like "install", "update", etc.) or
they can set the environment variable "PKG_CONCURRENCY=N". (In both
cases N is an integer which specifies the desired concurrency level.)
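
The lookup order for the concurrency setting might be sketched as
follows. The function name is hypothetical, and the precedence (the
"-C N" value wins over PKG_CONCURRENCY) is an assumption, since the
text above does not state which source takes priority. The text also
does not define what values below 1 mean, so the sketch does not
interpret them.

```python
import os

DEFAULT_CONCURRENCY = 1  # default named in the text: serial behavior


def resolve_concurrency(cli_value=None, environ=None):
    """Pick the concurrency level for operations on child images.

    Assumption: a value given with "-C N" takes precedence over the
    PKG_CONCURRENCY environment variable. If neither is set, fall
    back to the default of 1.
    """
    environ = os.environ if environ is None else environ
    raw = cli_value
    if raw is None:
        raw = environ.get("PKG_CONCURRENCY")
    if raw is None:
        return DEFAULT_CONCURRENCY
    return int(raw)  # raises ValueError if N is not an integer


print(resolve_concurrency(environ={}))
print(resolve_concurrency(environ={"PKG_CONCURRENCY": "2"}))
print(resolve_concurrency("4", {"PKG_CONCURRENCY": "2"}))
```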

Currently, pkg.1 worker subprocesses are invoked via the pkg.1 CLI
interfaces. When switching to parallel execution this will be changed
to use a JSON encoded RPC execution model. This richer interface is
needed to allow worker processes to pause and resume execution between
stages so that we can do multi-staged operations in a single process.

Unfortunately, the current implementation does not yet retain child
processes across different stages of execution. Instead, whenever we
start a new stage of execution, we spawn one process for each child
image, then we make a remote procedure call into N images at once
(where N is our concurrency level). When an RPC returns, that child
process exits and we start a call for the next available child.
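
The scheduling pattern for one stage (at most N calls in flight, with
the next child started as soon as one call returns) can be sketched
with a bounded worker pool. This is only an illustration: it uses
threads and a fake RPC function in place of real pkg.1 worker
processes, and every name in it is hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


def run_stage(children, concurrency, rpc_call):
    """Call rpc_call(child) for every child, `concurrency` at a time."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so results line up with children.
        return dict(zip(children, pool.map(rpc_call, children)))


# Bookkeeping used only to observe how many calls overlap.
stats = {"active": 0, "peak": 0}
stats_lock = Lock()


def fake_rpc(child):
    """Stand-in for one remote procedure call into a child image."""
    with stats_lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.01)  # pretend the child is doing work
    with stats_lock:
        stats["active"] -= 1
    return f"{child}: done"


children = [f"zone{i}" for i in range(1, 6)]
results = run_stage(children, 2, fake_rpc)
```

With a concurrency of 2 and five children, no more than two calls ever
overlap, and a concurrency of 1 reduces to serial execution.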

Ultimately, we'd like to move to a model where we have a pool of N
worker processes, and those processes can operate on different images as
necessary. These processes would be persistent across all stages of
execution, and ideally, when moving from one stage to another these
processes could cache in memory the state for at least N child images so
that the processes could simply resume execution where they last left
off.

The client side of this RPC interface will live in a new module called
PkgRemote. The linked image subsystem will use the PkgRemote module to
initiate operations on child images. One PkgRemote instance will be
allocated for each child that we are operating on. Currently, this
PkgRemote module will only support the sync and update operations used
within linked images, but in the future it could easily be expanded to
support other remote pkg.1 operations so that we can support recursive
linked image operations (see 7140357). When PkgRemote invokes an
operation on a child image it will fork off a new pkg.1 worker process
as follows:

        pkg -R /path/to/linked/image remote --ctlfd=5

This new pkg.1 worker process will function as an RPC server which the
client will make requests to.
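
As a rough sketch of how a client could start such a worker, the
helpers below build the command line shown above and keep the control
descriptor open in the child. The helper names are hypothetical and
this is not the PkgRemote implementation. ``pass_fds`` is the standard
``subprocess.Popen`` parameter for leaving chosen descriptors open in
the child.

```python
import subprocess


def worker_argv(image_path, ctl_fd):
    """Build the worker command line for one child image."""
    return ["pkg", "-R", image_path, "remote", f"--ctlfd={ctl_fd}"]


def spawn_worker(image_path, ctl_fd):
    """Start a worker process that inherits the control descriptor.

    pass_fds keeps ctl_fd open in the child under the same number, so
    the value given to --ctlfd is still valid after exec.
    """
    return subprocess.Popen(worker_argv(image_path, ctl_fd),
                            pass_fds=(ctl_fd,))


argv = worker_argv("/path/to/linked/image", 5)
print(" ".join(argv))
```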
11a8cae074e0
7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff
changeset
|
141 |
|
11a8cae074e0
7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff
changeset
|
142 |
Rpc communication between the client and server will be done via json |
11a8cae074e0
7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff
changeset
|
143 |
encoded rpc. These requests will be sent between the client and server |
11a8cae074e0
7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff
changeset
|
144 |
via a pipe. The communication pipe is created by the client, and its
file descriptor is passed to the server via fork/exec. The server is
told about the pipe file descriptor via the --ctlfd parameter. To avoid
issues with blocking IO, all communication via this pipe will be done by
passing file descriptors. For example, if the client wants to send an
rpc request to the server, it will write that rpc request into a
temporary file and then send the fd associated with the temporary file
over the pipe. Any reply from the server will be similarly serialized
and then sent via a file descriptor over the pipe. This should ensure
that, no matter the size of the request or the response, we will not
block when sending or receiving messages via the pipe. (Currently, the
limit of fds that can be queued in a pipe is around 700. Given that our
rpc model pairs each request with a response, it seems unlikely that
we'd ever hit this limit.)
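
The exchange described above can be sketched as follows. This is only
an illustration under stated assumptions, not the pkg implementation:
it uses a Unix domain socket pair with Python's ``socket.send_fds`` and
``socket.recv_fds`` (Python 3.9 or later) as a portable stand-in for
passing descriptors over a pipe, and ``send_message`` and
``recv_message`` are hypothetical names.

```python
import json
import os
import socket
import tempfile


def send_message(sock, obj):
    """Serialize obj into a temporary file and pass only its fd."""
    with tempfile.TemporaryFile() as tmp:
        tmp.write(json.dumps(obj).encode("utf-8"))
        tmp.flush()
        # Only one placeholder byte travels on the channel itself; the
        # real data lives in the file the descriptor refers to, so the
        # message size never affects whether this call blocks.
        socket.send_fds(sock, [b"\0"], [tmp.fileno()])


def recv_message(sock):
    """Receive one fd and read the complete message back from it."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    with os.fdopen(fds[0], "rb") as tmp:
        # The received fd shares its file offset with the sender's fd,
        # which was left at the end of the data, so rewind first.
        tmp.seek(0)
        return json.loads(tmp.read().decode("utf-8"))
```

The sender can close its copy of the temporary file as soon as the fd
has been queued, because the descriptor in flight keeps the file alive
until the receiver closes it.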

In the pkg.1 worker server process, we will have a simple json rpc
server that lives within client.py. This server will listen for
requests from the client and invoke client.py subcommand interfaces
(like update()). The client.py subcommand interfaces were chosen as the
targets of remote rpc calls for the following reasons:

- Least amount of encoding / decoding. Since these interfaces are
  invoked just after parsing user arguments, they mostly involve simple
  arguments (strings, integers, etc.) which have a direct json encoding.
  Additionally, the return values from these calls are simple return
  code integers, not objects, which means the results are also easy to
  encode. This means that we don't need lots of extra serialization /
  de-serialization logic (for things like api exceptions, etc.).

- Output and exception handling. The client.py interfaces already
  handle exceptions and output for the client. This means that we don't
  have to create new output classes and build our own output and
  exception handling code; instead, we leverage the existing code.

- Future recursion support. Currently, when recursing into child images
  we only execute "sync" and "update" operations. Eventually we want to
  support pkg.1 subcommand recursion into linked images (see 7140357)
  for many more operations. If we do this, the client.py interfaces
  provide a nice boundary since there will be an almost 1:1 mapping
  between parent and child subcommand operations.
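
A minimal sketch of such a dispatcher is shown below. It assumes a
simple request shape (``id``, ``method``, ``params``) that is not taken
from the actual implementation, and the ``update`` and ``sync``
functions are hypothetical stand-ins for the client.py subcommand
interfaces. The point is only that plain arguments go in and an integer
return code comes out, so the json encoding is direct.

```python
import json

EXIT_OK = 0  # hypothetical return code used by the stand-ins below


def update(pargs, noexecute=False):
    """Hypothetical stand-in for the client.py update() interface."""
    return EXIT_OK


def sync(noexecute=False):
    """Hypothetical stand-in for a client.py sync interface."""
    return EXIT_OK


# Remote callers may only invoke the interfaces listed here.
METHODS = {"update": update, "sync": sync}


def handle_request(raw):
    """Decode one json request, call the subcommand, encode the reply."""
    req = json.loads(raw)
    func = METHODS.get(req.get("method"))
    if func is None:
        reply = {"id": req.get("id"), "result": None,
                 "error": "unknown method"}
    else:
        # Arguments are plain strings, integers and booleans, and the
        # result is an integer return code, so no custom encoding is
        # needed in either direction.
        rv = func(**req.get("params", {}))
        reply = {"id": req.get("id"), "result": rv, "error": None}
    return json.dumps(reply)
```

Keeping an explicit table of callable methods also means that adding
recursion support for another subcommand would be a one-line change in
this sketch.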


Child process output and progress management
--------------------------------------------

Currently, since child execution happens serially, all child images have
direct access to standard out and display their progress directly there.
Once we start updating child images in parallel this will no longer be
possible. Instead, all output from children will be logged to temporary
files and displayed by the parent when a child completes a given stage
of execution.
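
The logging scheme can be illustrated with the sketch below. It is an
assumption-laden simplification: ``run_stage`` is a hypothetical
helper, the children are arbitrary commands rather than pkg.1 worker
processes, and output is collected in start order rather than in
completion order.

```python
import subprocess
import sys
import tempfile


def run_stage(commands):
    """Run one command per child image in parallel.

    commands maps a child name to an argv list. Returns a dict mapping
    each child name to (exit code, captured output).
    """
    running = {}
    for name, argv in commands.items():
        # Each child gets its own log file instead of the shared stdout.
        log = tempfile.TemporaryFile()
        proc = subprocess.Popen(argv, stdout=log,
                                stderr=subprocess.STDOUT)
        running[name] = (proc, log)

    results = {}
    for name, (proc, log) in running.items():
        rc = proc.wait()
        # The child shares the file offset, so rewind before reading.
        log.seek(0)
        results[name] = (rc, log.read().decode("utf-8", "replace"))
        log.close()
    return results


if __name__ == "__main__":
    # Stand-in children: each one just prints a single line.
    demo = {
        name: [sys.executable, "-c", "print('%s: stage done')" % name]
        for name in ("zone1", "zone2")
    }
    for name, (rc, text) in run_stage(demo).items():
        sys.stdout.write(text)
```

Because only the parent writes to standard out, the output of different
children is never interleaved, whatever order they finish in.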

Additionally, since child images will no longer have access to standard
out, we will need a new mechanism to indicate progress while operating
on child images. To do this we will have a progress pipe between each
parent and child image. The child image will write one byte to this
pipe whenever one of the ProgressTracker`*_progress() interfaces is
invoked. The parent process can read from this pipe to detect progress
within children and update its user-visible progress tracker
accordingly.
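
The progress pipe can be sketched as below. The class and method names
are hypothetical (``download_progress`` and ``actions_progress`` merely
follow the ``*_progress()`` naming pattern), and both ends run in one
process here for brevity; in the design above the write end would be
held by the child process.

```python
import os
import select


class PipeProgressTracker:
    """Hypothetical child-side tracker: one byte per progress event."""

    def __init__(self, wfd):
        self._wfd = wfd

    def _tick(self):
        # The byte value carries no meaning; only its arrival matters.
        os.write(self._wfd, b".")

    def download_progress(self):
        self._tick()

    def actions_progress(self):
        self._tick()


def drain_progress(rfd, timeout=0):
    """Parent side: return the number of pending progress events.

    Uses select() so that the parent never blocks when a child has made
    no progress since the last call.
    """
    ready, _, _ = select.select([rfd], [], [], timeout)
    if not ready:
        return 0
    return len(os.read(rfd, 4096))
```

A parent would call ``drain_progress`` for each child's pipe from its
main loop and advance its own progress display whenever the count is
non-zero.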