doc/parallel-linked-images.txt
author Yiteng Zhang <yiteng.zhang@oracle.com>
Wed, 09 Mar 2016 11:27:23 -0800
changeset 3321 52e8eec3014c
parent 2690 11a8cae074e0
permissions -rw-r--r--
17377205 IPS should not use M2Crypto 22332625 test suite should test signing certs with unsupported extensions 16718631 pkg verify traceback "AttributeError: 'int' object has no attribute 'check__ca'"
2690
11a8cae074e0 7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff changeset
.. This document is formatted using reStructuredText, which is a Markup
   Syntax and Parser Component of Docutils for Python.  An html version
   of this document can be generated using the following command:
     rst2html.py doc/parallel-linked-images.txt >doc/parallel-linked-images.html

======================
Parallel Linked Images
======================

:Author: Edward Pilatowicz
:Version: 0.1


Problems
========

Currently linked image recursion is done serially and in stages.  For
example, when we perform a "pkg update" on an image then for each child
image we will execute multiple pkg.1 cli operations.  The multiple pkg.1
invocations on a single child image correspond with the following
sequential stages of pkg.1 execution:

1) publisher check: sanity check child publisher configuration against
   parent publisher configuration.
2) planning: plan fmri and action changes.
3) preparation: download content needed to execute planned changes.
4) execution: execute planned changes.

So to update an image with children, we invoke pkg.1 four times for each
child image.  This architecture is inefficient for multiple reasons:

- we don't do any operations on child images in parallel

- when executing multiple pkg.1 invocations to perform a single
  operation on a child image, we are constantly throwing out and
  re-initializing lots of pkg.1 state.

To make matters worse, as we execute stages 3 and 4 on a child image
the pkg client also re-executes previous stages.  For example, when we
start stage 4 (execution) we re-execute stages 2 and 3.  So for each
child we update we end up invoking stage 2 three times, and stage 3
twice.  This leads to bugs like 18393 (where it seems that we download
packages twice).  It also means that we have caching code buried within
the packaging system that attempts to cache internal state to disk in an
effort to speed up subsequent re-runs of previous stages.
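
The repeated work described above can be tallied with a small sketch.
It assumes that the stage 3 invocation re-runs stage 2 and that the
stage 4 invocation re-runs stages 2 and 3, as the text states; the table
name and function name are illustrative and are not part of pkg.1.

```python
from collections import Counter

# Illustrative model only: which stages each of the four pkg.1
# invocations on one child image ends up executing, per the text above.
STAGES_RUN_BY_INVOCATION = {
    "publisher check": ["publisher check"],
    "planning": ["planning"],
    "preparation": ["planning", "preparation"],
    "execution": ["planning", "preparation", "execution"],
}


def count_stage_runs(invocations=STAGES_RUN_BY_INVOCATION):
    """Return how many times each stage runs for a single child image."""
    runs = Counter()
    for stages in invocations.values():
        runs.update(stages)
    return runs


if __name__ == "__main__":
    for stage, count in count_stage_runs().items():
        print(f"{stage}: {count}")
```

Under these assumptions the planning stage runs three times and the
preparation stage twice for every child image.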


Solutions
=========


Eliminate duplicate work
------------------------

We want to eliminate a lot of the duplicate work done when executing
packaging operations on children in stages.  To do this we will update
the pkg client api to allow callers to:

- Save an image plan to disk.
- Load an image plan from disk.
- Execute a loaded plan from disk without first "preparing" it.  (This
  assumes that the caller has already "prepared" the plan in a previous
  invocation.)

In addition to eliminating duplicated work during staged execution, this
will also allow us to stop caching intermediate state internally within
the package system.  Instead client.py will be enhanced to cache the
image plan and it will be the only component that knows about "staging".

To allow us to save and restore plans, all image plan data will be saved
within a PlanDescription object, and we will support serializing this
object into a json format.  The json format for saved image plans is an
internal, unstable, and unversioned private interface.  We will not
support saving an image plan to disk and then executing it later with a
different version of the packaging system on a different host.  Also,
even though we will be adding data into the PlanDescription object we
will not be exposing any new information about an image plan via the
PlanDescription object to api consumers.
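
A minimal sketch of the save/load round trip might look like the
following.  The class, its fields, and the helper names are all
hypothetical; the real PlanDescription object and its json layout are
private and are not described in this document.

```python
import json
import os
import tempfile


class PlanDescriptionSketch:
    """Hypothetical stand-in for PlanDescription; field names are invented."""

    def __init__(self, fmri_changes=None, prepared=False):
        # List of [old_fmri, new_fmri] pairs; None means "not installed".
        self.fmri_changes = list(fmri_changes or [])
        # True once the content needed by the plan has been downloaded.
        self.prepared = prepared

    def getstate(self):
        """Return a json-serializable snapshot of the plan."""
        return {"fmri_changes": self.fmri_changes, "prepared": self.prepared}

    @classmethod
    def fromstate(cls, state):
        """Rebuild a plan from a snapshot produced by getstate()."""
        return cls(state["fmri_changes"], state["prepared"])


def save_plan(plan, path):
    """Serialize the plan to `path` as json."""
    with open(path, "w") as f:
        json.dump(plan.getstate(), f)


def load_plan(path):
    """Load a plan previously written by save_plan()."""
    with open(path) as f:
        return PlanDescriptionSketch.fromstate(json.load(f))


if __name__ == "__main__":
    plan = PlanDescriptionSketch([[None, "pkg://example/foo@1.0"]], prepared=True)
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    save_plan(plan, path)
    print(load_plan(path).getstate())
    os.remove(path)
```

A later stage could then load the plan and, because it is already marked
as prepared, go straight to execution instead of planning again.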

An added advantage of allowing api consumers to save an image plan to
disk is that it should help with our plans to have the api.gen_plan_*()
functions be able to return PlanDescription objects for child images.
A file descriptor (or path) associated with a saved image plan would be
one way for child images to pass image plans back to their parent (which
could then load them and yield them as results to api.gen_plan_*()).


Update children in parallel
---------------------------

We want to enhance the package client so that it can update child images
in parallel.

Due to potential resource constraints (cpu, memory, and disk io) we
cannot entirely remove the ability to operate on child images serially.
Instead, we plan to allow for a concurrency setting that specifies how
many child images we are willing to update in parallel.  By default when
operating on child images we will use a concurrency setting of 1, which
maintains the current behavior of the packaging system.  If a user wants
to specify a higher concurrency setting, they can use the "-C N" option
to subcommands that recurse (like "install", "update", etc.) or they can
set the environment variable "PKG_CONCURRENCY=N".  (In both cases N is
an integer which specifies the desired concurrency level.)
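
How a client might resolve the concurrency setting is sketched below.
The function name is hypothetical, and the precedence shown (the "-C"
option wins over PKG_CONCURRENCY, which wins over the default of 1) is
an assumption; this document does not say which source takes priority.

```python
import argparse
import os

DEFAULT_CONCURRENCY = 1  # serial, matching the current behavior


def resolve_concurrency(argv, environ=None):
    """Pick the concurrency level from "-C N", PKG_CONCURRENCY, or the default."""
    environ = os.environ if environ is None else environ

    # Only "-C" is parsed here; every other argument is ignored.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-C", dest="concurrency", type=int, default=None)
    args, _unparsed = parser.parse_known_args(argv)

    if args.concurrency is not None:
        # Assumed precedence: the command line option wins.
        return args.concurrency
    if "PKG_CONCURRENCY" in environ:
        return int(environ["PKG_CONCURRENCY"])
    return DEFAULT_CONCURRENCY


if __name__ == "__main__":
    print(resolve_concurrency(["update", "-C", "4"], {}))
    print(resolve_concurrency(["update"], {"PKG_CONCURRENCY": "3"}))
    print(resolve_concurrency(["update"], {}))
```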

Currently, pkg.1 worker subprocesses are invoked via the pkg.1 cli
interfaces.  When switching to parallel execution this will be changed
to use a json encoded rpc execution model.  This richer interface is
needed to allow worker processes to pause and resume execution between
stages so that we can do multi-staged operations in a single process.

Unfortunately, the current implementation does not yet retain child
processes across different stages of execution.  Instead, whenever we
start a new stage of execution, we spawn one process for each child
image, then we make a remote procedure call into N images at once
(where N is our concurrency level).  When an RPC returns, that child
process exits and we start a call for the next available child.
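
The "N calls in flight at once" behavior can be sketched as follows.
This is not how the linked image code is written: a thread pool stands
in for the per-child worker processes, and the `rpc` callable is a
hypothetical placeholder for the real remote procedure call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_stage_on_children(children, stage, concurrency, rpc):
    """Call rpc(child, stage) for every child, at most `concurrency` at once.

    As soon as one call returns, the pool starts the call for the next
    child that is still waiting.  Results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda child: rpc(child, stage), children))


def fake_rpc(child, stage):
    """Placeholder for a blocking rpc into one child image."""
    time.sleep(0.01)
    return (child, stage, "ok")


if __name__ == "__main__":
    zones = ["zone1", "zone2", "zone3", "zone4", "zone5"]
    for result in run_stage_on_children(zones, "planning", 2, fake_rpc):
        print(result)
```

With a concurrency of 1 the same function degenerates to the serial
behavior described earlier.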

Ultimately, we'd like to move to a model where we have a pool of N
worker processes, and those processes can operate on different images as
necessary.  These processes would be persistent across all stages of
execution, and ideally, when moving from one stage to another these
processes could cache in memory the state for at least N child images so
that the processes could simply resume execution where they last left
off.

The client side of this rpc interface will live in a new module called
PkgRemote.  The linked image subsystem will use the PkgRemote module to
initiate operations on child images.  One PkgRemote instance will be
allocated for each child that we are operating on.  Currently, this
PkgRemote module will only support the sync and update operations used
within linked images, but in the future it could easily be expanded to
support other remote pkg.1 operations so that we can support recursive
linked image operations (see 7140357).  When PkgRemote invokes an
operation on a child image it will fork off a new pkg.1 worker process
as follows::

	pkg -R /path/to/linked/image remote --ctlfd=5

This new pkg.1 worker process will function as an rpc server which the
client will make requests to.
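
The way a descriptor number reaches the worker through "--ctlfd" can be
sketched with a stand-in child.  The child below is a tiny Python
program, not pkg.1, and it reads plain text from the inherited
descriptor rather than speaking rpc; only the fork/exec inheritance and
the "--ctlfd=N" convention are taken from this document.

```python
import os
import subprocess
import sys

# Stand-in for the pkg.1 worker: parse --ctlfd=N, read one line from
# descriptor N, and echo it on stdout.
CHILD_SOURCE = r"""
import os, sys
fd = int(sys.argv[1].split("=", 1)[1])
with os.fdopen(fd) as ctl:
    sys.stdout.write("worker got: " + ctl.readline())
"""


def spawn_worker_and_send(message):
    """Start the stand-in worker with an inherited pipe and send one line."""
    read_fd, write_fd = os.pipe()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE, "--ctlfd=%d" % read_fd],
            pass_fds=[read_fd],  # keep this descriptor open in the child
            stdout=subprocess.PIPE,
            text=True,
        )
    finally:
        os.close(read_fd)  # the parent only needs the write end
    with os.fdopen(write_fd, "w") as ctl:
        ctl.write(message + "\n")
    out, _ = proc.communicate()
    return out


if __name__ == "__main__":
    print(spawn_worker_and_send("hello"), end="")
```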
11a8cae074e0 7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff changeset
   141
11a8cae074e0 7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff changeset
   142
Rpc communication between the client and server will be done via json
11a8cae074e0 7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
diff changeset
   143
encoded rpc.  These requests will be sent between the client and server
11a8cae074e0 7140224 package downloaded messages displayed twice for each zone
Edward Pilatowicz <edward.pilatowicz@oracle.com>
parents:
via a pipe.  The communication pipe is created by the client, and its
file descriptor is passed to the server via fork/exec.  The server is
told about the pipe file descriptor via the --ctlfd parameter.  To avoid
issues with blocking I/O, all communication via this pipe will be done
by passing file descriptors.  For example, if the client wants to send
an RPC request to the server, it will write that request into a
temporary file and then send the fd associated with the temporary file
over the pipe.  Any reply from the server will be similarly serialized
and then sent via a file descriptor over the pipe.  This should ensure
that no matter the size of the request or the response, we will not
block when sending or receiving requests via the pipe.  (Currently, the
limit of fds that can be queued in a pipe is around 700.  Given that our
RPC model includes matched requests and responses, it seems unlikely
that we'd ever hit this limit.)
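The "write the message to a temporary file, then pass its fd" pattern
described above can be sketched as follows.  This is only an
illustration under stated assumptions: it uses a Unix domain socket pair
and Python's ``socket.send_fds()`` / ``socket.recv_fds()`` (Python 3.9
or later) as a stand-in for the pipe-based fd passing the design refers
to, and the names ``send_message()`` and ``recv_message()`` are
hypothetical, not part of pkg.

```python
import json
import os
import socket
import tempfile


def send_message(sock, payload):
    """Serialize payload into a temporary file and pass its fd to the peer."""
    with tempfile.TemporaryFile() as tmp:
        tmp.write(json.dumps(payload).encode("utf-8"))
        tmp.flush()
        # Only a one-byte marker and the descriptor cross the channel, so
        # the cost of the send does not grow with the size of the payload.
        socket.send_fds(sock, [b"m"], [tmp.fileno()])
    # Our copy of the fd is closed here; the descriptor in flight keeps
    # the (already unlinked) temporary file alive for the receiver.


def recv_message(sock):
    """Receive one descriptor and decode the JSON document behind it."""
    _marker, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    with os.fdopen(fds[0], "rb") as tmp:
        # The file offset is shared with the sender, so rewind first.
        tmp.seek(0)
        return json.loads(tmp.read().decode("utf-8"))
```

Because the payload itself never travels through the channel, a request
of several megabytes costs the sender the same single small write as a
tiny one.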

In the pkg.1 worker server process, we will have a simple JSON RPC
server that lives within client.py.  This server will listen for
requests from the client and invoke client.py subcommand interfaces
(like update()).  The client.py subcommand interfaces were chosen as the
remote interfaces targeted by RPC calls for the following reasons:

- Least amount of encoding / decoding.  Since these interfaces are
  invoked just after parsing user arguments, they mostly involve simple
  arguments (strings, integers, etc.) which have a direct JSON encoding.
  Additionally, the return values from these calls are simple return
  code integers, not objects, which means the results are also easy to
  encode.  This means that we don't need lots of extra serialization /
  de-serialization logic (for things like api exceptions, etc.).

- Output and exception handling.  The client.py interfaces already
  handle exceptions and output for the client.  This means that we don't
  have to create new output classes and build our own output and
  exception handling code; instead, we leverage the existing code.

- Future recursion support.  Currently, when recursing into child images
  we only execute "sync" and "update" operations.  Eventually we want to
  support pkg.1 subcommand recursion into linked images (see 7140357)
  for many more operations.  If we do this, the client.py interfaces
  provide a nice boundary since there will be an almost 1:1 mapping
  between parent and child subcommand operations.
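To make the first point above concrete, here is a minimal sketch of an
RPC dispatcher whose targets take plain arguments and return integer
return codes.  Everything in it is an assumption made for illustration:
the ``update()`` and ``sync()`` functions are hypothetical stand-ins
(not the real client.py signatures), the return codes are arbitrary, and
the JSON-RPC 2.0 message layout is assumed since the text does not name
a protocol version.

```python
import json


# Hypothetical stand-ins for client.py subcommand interfaces: plain
# arguments in, an integer return code out.
def update(pkgs, noexecute=False):
    return 0


def sync(noexecute=False):
    return 0


SUBCOMMANDS = {"update": update, "sync": sync}


def handle_request(raw):
    """Decode one request, invoke the subcommand, and encode the reply."""
    req = json.loads(raw)
    func = SUBCOMMANDS.get(req.get("method"))
    if func is None:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        reply = {"jsonrpc": "2.0", "id": req.get("id"),
                 "error": {"code": -32601, "message": "Method not found"}}
    else:
        # Arguments and the integer result map directly onto JSON types,
        # so no custom serialization logic is needed.
        rv = func(**req.get("params", {}))
        reply = {"jsonrpc": "2.0", "id": req.get("id"), "result": rv}
    return json.dumps(reply)
```

A request such as ``{"jsonrpc": "2.0", "id": 1, "method": "update",
"params": {"pkgs": ["foo"]}}`` is answered with a reply whose
``result`` is the integer return code.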


Child process output and progress management
--------------------------------------------

Currently, since child execution happens serially, all child images have
direct access to standard out and display their progress directly there.
Once we start updating child images in parallel, this will no longer be
possible.  Instead, all output from children will be logged to temporary
files and displayed by the parent when a child completes a given stage
of execution.
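The output handling described above might look roughly like the sketch
below.  It is a simplification, not the pkg implementation: children
here are arbitrary commands started with ``subprocess``, "completing a
stage" is approximated by process exit, and ``run_children()`` is a
hypothetical helper name.

```python
import subprocess
import sys
import tempfile


def run_children(commands):
    """Run all commands in parallel, logging each one's output to a temp file.

    commands is a list of (name, argv) pairs.  Returns a dict mapping each
    name to the text that child wrote to stdout/stderr.
    """
    running = []
    for name, argv in commands:
        log = tempfile.TemporaryFile()
        # The child never touches the parent's standard out; everything
        # it prints lands in its private log file.
        proc = subprocess.Popen(argv, stdout=log, stderr=subprocess.STDOUT)
        running.append((name, proc, log))

    results = {}
    for name, proc, log in running:
        proc.wait()
        # The child's writes advanced the shared file offset; rewind.
        log.seek(0)
        results[name] = log.read().decode()
        log.close()
    return results


if __name__ == "__main__":
    cmds = [(n, [sys.executable, "-c", "print('%s done')" % n])
            for n in ("zone1", "zone2")]
    for name, text in run_children(cmds).items():
        # The parent displays each child's output as one uninterleaved unit.
        sys.stdout.write("[%s] %s" % (name, text))
```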

Additionally, since child images will no longer have access to standard
out, we will need a new mechanism to indicate progress while operating
on child images.  To do this we will have a progress pipe between each
parent and child image.  The child image will write one byte to this
pipe whenever one of the ``ProgressTracker.*_progress()`` interfaces is
invoked.  The parent process can read from this pipe to detect progress
within children and update its user-visible progress tracker
accordingly.
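A minimal sketch of the progress pipe follows, assuming a Unix system
with ``os.fork()``.  The class ``PipeProgressTracker`` and its
``example_progress()`` method are hypothetical names standing in for a
tracker whose ``*_progress()`` interfaces write to the pipe, and the
parent simply counts ticks where a real parent would update its
progress display.

```python
import os


class PipeProgressTracker:
    """Hypothetical child-side tracker: one byte per progress event."""

    def __init__(self, fd):
        self.fd = fd

    def example_progress(self):
        # The byte's value carries no meaning; its arrival is the signal.
        os.write(self.fd, b".")


def run_with_progress(steps):
    """Fork a child that reports `steps` progress events; return the count seen."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keeps only the write end of the pipe.
        os.close(rfd)
        tracker = PipeProgressTracker(wfd)
        for _ in range(steps):
            tracker.example_progress()
        os._exit(0)

    # Parent: keeps only the read end and counts the bytes that arrive.
    os.close(wfd)
    ticks = 0
    while True:
        data = os.read(rfd, 1024)
        if not data:
            # End of file: the child has exited and closed its end.
            break
        ticks += len(data)
    os.close(rfd)
    os.waitpid(pid, 0)
    return ticks
```

With several children running at once, the parent would hold one read
end per child and could wait on all of them with ``select()`` instead of
blocking on a single pipe.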