author | Casper H.S. Dik <Casper.Dik@Sun.COM> |
Wed, 28 Apr 2010 10:01:37 +0200 | |
changeset 12273 | 63678502e95e |
parent 11861 | a63258283f8f |
child 12633 | 9f2cda0ed938 |
permissions | -rw-r--r-- |
0 | 1 |
/* |
2 |
* CDDL HEADER START |
|
3 |
* |
|
4 |
* The contents of this file are subject to the terms of the |
|
1676 | 5 |
* Common Development and Distribution License (the "License"). |
6 |
* You may not use this file except in compliance with the License. |
|
0 | 7 |
* |
8 |
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE |
|
9 |
* or http://www.opensolaris.org/os/licensing. |
|
10 |
* See the License for the specific language governing permissions |
|
11 |
* and limitations under the License. |
|
12 |
* |
|
13 |
* When distributing Covered Code, include this CDDL HEADER in each |
|
14 |
* file and include the License file at usr/src/OPENSOLARIS.LICENSE. |
|
15 |
* If applicable, add the following below this CDDL HEADER, with the |
|
16 |
* fields enclosed by brackets "[]" replaced with your own identifying |
|
17 |
* information: Portions Copyright [yyyy] [name of copyright owner] |
|
18 |
* |
|
19 |
* CDDL HEADER END |
|
20 |
*/ |
|
390 | 21 |
|
0 | 22 |
/* |
12273 | 23 |
* Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved. |
0 | 24 |
*/ |
25 |
||
26 |
/* |
|
27 |
* Zones |
|
28 |
* |
|
29 |
* A zone is a named collection of processes, namespace constraints, |
|
30 |
* and other system resources which comprise a secure and manageable |
|
31 |
* application containment facility. |
|
32 |
* |
|
33 |
* Zones (represented by the reference counted zone_t) are tracked in |
|
34 |
* the kernel in the zonehash. Elsewhere in the kernel, Zone IDs |
|
35 |
* (zoneid_t) are used to track zone association. Zone IDs are |
|
36 |
* dynamically generated when the zone is created; if a persistent |
|
37 |
* identifier is needed (core files, accounting logs, audit trail, |
|
38 |
* etc.), the zone name should be used. |
|
39 |
* |
|
40 |
* |
|
41 |
* Global Zone: |
|
42 |
* |
|
43 |
* The global zone (zoneid 0) is automatically associated with all |
|
44 |
* system resources that have not been bound to a user-created zone. |
|
45 |
* This means that even systems where zones are not in active use |
|
46 |
* have a global zone, and all processes, mounts, etc. are |
|
47 |
* associated with that zone. The global zone is generally |
|
48 |
* unconstrained in terms of privileges and access, though the usual |
|
49 |
* credential and privilege based restrictions apply. |
|
50 |
* |
|
51 |
* |
|
52 |
* Zone States: |
|
53 |
* |
|
54 |
* The states a zone may be in, and the transitions between them, are as |
|
55 |
* follows: |
|
56 |
* |
|
57 |
* ZONE_IS_UNINITIALIZED: primordial state for a zone. The partially |
|
58 |
* initialized zone is added to the list of active zones on the system but |
|
59 |
* isn't accessible. |
|
60 |
* |
|
5880 | 61 |
* ZONE_IS_INITIALIZED: Initialization complete except the ZSD callbacks are |
62 |
* not yet completed. Not possible to enter the zone, but attributes can |
|
63 |
* be retrieved. |
|
64 |
* |
|
0 | 65 |
* ZONE_IS_READY: zsched (the kernel dummy process for a zone) is |
66 |
* ready. The zone is made visible after the ZSD constructor callbacks are |
|
67 |
* executed. A zone remains in this state until it transitions into |
|
68 |
* the ZONE_IS_BOOTING state as a result of a call to zone_boot(). |
|
69 |
* |
|
70 |
* ZONE_IS_BOOTING: in this short-lived state, zsched attempts to start |
|
71 |
* init. Should that fail, the zone proceeds to the ZONE_IS_SHUTTING_DOWN |
|
72 |
* state. |
|
73 |
* |
|
74 |
* ZONE_IS_RUNNING: The zone is open for business: zsched has |
|
75 |
* successfully started init. A zone remains in this state until |
|
76 |
* zone_shutdown() is called. |
|
77 |
* |
|
78 |
* ZONE_IS_SHUTTING_DOWN: zone_shutdown() has been called, the system is |
|
79 |
* killing all processes running in the zone. The zone remains |
|
80 |
* in this state until there are no more user processes running in the zone. |
|
81 |
* zone_create(), zone_enter(), and zone_destroy() on this zone will fail. |
|
82 |
* Since zone_shutdown() is restartable, it may be called successfully |
|
83 |
* multiple times for the same zone_t. Setting of the zone's state to |
|
84 |
* ZONE_IS_SHUTTING_DOWN is synchronized with mounts, so VOP_MOUNT() may check |
|
85 |
* the zone's status without worrying about it being a moving target. |
|
86 |
* |
|
87 |
* ZONE_IS_EMPTY: zone_shutdown() has been called, and there |
|
88 |
* are no more user processes in the zone. The zone remains in this |
|
89 |
* state until there are no more kernel threads associated with the |
|
90 |
* zone. zone_create(), zone_enter(), and zone_destroy() on this zone will |
|
91 |
* fail. |
|
92 |
* |
|
93 |
* ZONE_IS_DOWN: All kernel threads doing work on behalf of the zone |
|
94 |
* have exited. zone_shutdown() returns. Henceforth it is not possible to |
|
95 |
* join the zone or create kernel threads therein. |
|
96 |
* |
|
97 |
* ZONE_IS_DYING: zone_destroy() has been called on the zone; zone |
|
98 |
* remains in this state until zsched exits. Calls to zone_find_by_*() |
|
99 |
* return NULL from now on. |
|
100 |
* |
|
101 |
* ZONE_IS_DEAD: zsched has exited (zone_ntasks == 0). There are no |
|
102 |
* processes or threads doing work on behalf of the zone. The zone is |
|
103 |
* removed from the list of active zones. zone_destroy() returns, and |
|
104 |
* the zone can be recreated. |
|
105 |
* |
|
106 |
* ZONE_IS_FREE (internal state): zone_ref goes to 0, ZSD destructor |
|
107 |
* callbacks are executed, and all memory associated with the zone is |
|
108 |
* freed. |
|
109 |
* |
|
110 |
* Threads can wait for the zone to enter a requested state by using |
|
111 |
* zone_status_wait() or zone_status_timedwait() with the desired |
|
112 |
* state passed in as an argument. Zone state transitions are |
|
113 |
* uni-directional; it is not possible to move back to an earlier state. |
|
114 |
* |
|
115 |
* |
|
116 |
* Zone-Specific Data: |
|
117 |
* |
|
118 |
* Subsystems needing to maintain zone-specific data can store that |
|
119 |
* data using the ZSD mechanism. This provides a zone-specific data |
|
120 |
* store, similar to thread-specific data (see pthread_getspecific(3C) |
|
121 |
* or the TSD code in uts/common/disp/thread.c). Also, ZSD can be used |
|
122 |
* to register callbacks to be invoked when a zone is created, shut |
|
123 |
* down, or destroyed. This can be used to initialize zone-specific |
|
124 |
* data for new zones and to clean up when zones go away. |
|
125 |
* |
|
126 |
* |
|
127 |
* Data Structures: |
|
128 |
* |
|
129 |
* The per-zone structure (zone_t) is reference counted, and freed |
|
130 |
* when all references are released. zone_hold and zone_rele can be |
|
131 |
* used to adjust the reference count. In addition, reference counts |
|
132 |
* associated with the cred_t structure are tracked separately using |
|
133 |
* zone_cred_hold and zone_cred_rele. |
|
134 |
* |
|
135 |
* Pointers to active zone_t's are stored in two hash tables; one |
|
136 |
* for searching by id, the other for searching by name. Lookups |
|
137 |
* can be performed on either basis, using zone_find_by_id and |
|
138 |
* zone_find_by_name. Both return zone_t pointers with the zone |
|
139 |
* held, so zone_rele should be called when the pointer is no longer |
|
140 |
* needed. Zones can also be searched by path; zone_find_by_path |
|
141 |
* returns the zone with which a path name is associated (global |
|
142 |
* zone if the path is not within some other zone's file system |
|
143 |
* hierarchy). This currently requires iterating through each zone, |
|
144 |
* so it is slower than an id or name search via a hash table. |
|
145 |
* |
|
146 |
* |
|
147 |
* Locking: |
|
148 |
* |
|
149 |
* zonehash_lock: This is a top-level global lock used to protect the |
|
150 |
* zone hash tables and lists. Zones cannot be created or destroyed |
|
151 |
* while this lock is held. |
|
152 |
* zone_status_lock: This is a global lock protecting zone state. |
|
153 |
* Zones cannot change state while this lock is held. It also |
|
154 |
* protects the list of kernel threads associated with a zone. |
|
155 |
* zone_lock: This is a per-zone lock used to protect several fields of |
|
156 |
* the zone_t (see <sys/zone.h> for details). In addition, holding |
|
157 |
* this lock means that the zone cannot go away. |
|
3247 | 158 |
* zone_nlwps_lock: This is a per-zone lock used to protect the fields |
159 |
* related to the zone.max-lwps rctl. |
|
160 |
* zone_mem_lock: This is a per-zone lock used to protect the fields |
|
161 |
* related to the zone.max-locked-memory and zone.max-swap rctls. |
|
0 | 162 |
* zsd_key_lock: This is a global lock protecting the key state for ZSD. |
163 |
* zone_deathrow_lock: This is a global lock protecting the "deathrow" |
|
164 |
* list (a list of zones in the ZONE_IS_DEAD state). |
|
165 |
* |
|
166 |
* Ordering requirements: |
|
167 |
* pool_lock --> cpu_lock --> zonehash_lock --> zone_status_lock --> |
|
168 |
* zone_lock --> zsd_key_lock --> pidlock --> p_lock |
|
169 |
* |
|
3247 | 170 |
* When taking zone_mem_lock or zone_nlwps_lock, the lock ordering is: |
171 |
* zonehash_lock --> a_lock --> pidlock --> p_lock --> zone_mem_lock |
|
172 |
* zonehash_lock --> a_lock --> pidlock --> p_lock --> zone_nlwps_lock |
|
173 |
* |
|
0 | 174 |
* Blocking memory allocations are permitted while holding any of the |
175 |
* zone locks. |
|
176 |
* |
|
177 |
* |
|
178 |
* System Call Interface: |
|
179 |
* |
|
180 |
* The zone subsystem can be managed and queried from user level with |
|
181 |
* the following system calls (all subcodes of the primary "zone" |
|
182 |
* system call): |
|
183 |
* - zone_create: creates a zone with selected attributes (name, |
|
789 | 184 |
* root path, privileges, resource controls, ZFS datasets) |
0 | 185 |
* - zone_enter: allows the current process to enter a zone |
186 |
* - zone_getattr: reports attributes of a zone |
|
2267 | 187 |
* - zone_setattr: set attributes of a zone |
188 |
* - zone_boot: set 'init' running for the zone |
|
0 | 189 |
* - zone_list: lists all zones active in the system |
190 |
* - zone_lookup: looks up zone id based on name |
|
191 |
* - zone_shutdown: initiates shutdown process (see states above) |
|
192 |
* - zone_destroy: completes shutdown process (see states above) |
|
193 |
* |
|
194 |
*/ |
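The one-way state progression described in the "Zone States" section above can be sketched in userland as follows. This is a minimal model, not the kernel code: the `model_*` names and the enum values are hypothetical, chosen only to mirror the order in which the states are listed, and the "reached" test (current state at or past the requested one) is an assumption about how a waiter would decide it is done.

```c
#include <assert.h>

/* Hypothetical mirror of the state order listed in the comment above. */
typedef enum model_status {
	MODEL_UNINITIALIZED = 0,
	MODEL_INITIALIZED,
	MODEL_READY,
	MODEL_BOOTING,
	MODEL_RUNNING,
	MODEL_SHUTTING_DOWN,
	MODEL_EMPTY,
	MODEL_DOWN,
	MODEL_DYING,
	MODEL_DEAD
} model_status_t;

typedef struct model_zone {
	model_status_t mz_status;
} model_zone_t;

/*
 * Attempt a transition. Returns 1 on success and 0 if the request would
 * move the zone back to an earlier state, which is never allowed.
 */
static int
model_status_set(model_zone_t *z, model_status_t next)
{
	if (next < z->mz_status)
		return (0);
	z->mz_status = next;
	return (1);
}

/*
 * A waiter asking for state "want" is satisfied once the zone is at or
 * past that state (assumed semantics; states may be skipped, for
 * example BOOTING straight to SHUTTING_DOWN when init fails to start).
 */
static int
model_status_reached(const model_zone_t *z, model_status_t want)
{
	return (z->mz_status >= want);
}
```

Because states only move forward, a check such as `model_status_reached(&z, MODEL_READY)` stays true once it becomes true, which is what makes it safe to test a state without racing against a reversal.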
|
195 |
||
196 |
#include <sys/priv_impl.h> |
|
197 |
#include <sys/cred.h> |
|
198 |
#include <c2/audit.h> |
|
199 |
#include <sys/debug.h> |
|
200 |
#include <sys/file.h> |
|
201 |
#include <sys/kmem.h> |
|
3247 | 202 |
#include <sys/kstat.h> |
0 | 203 |
#include <sys/mutex.h> |
1676 | 204 |
#include <sys/note.h> |
0 | 205 |
#include <sys/pathname.h> |
206 |
#include <sys/proc.h> |
|
207 |
#include <sys/project.h> |
|
1166 | 208 |
#include <sys/sysevent.h> |
0 | 209 |
#include <sys/task.h> |
210 |
#include <sys/systm.h> |
|
211 |
#include <sys/types.h> |
|
212 |
#include <sys/utsname.h> |
|
213 |
#include <sys/vnode.h> |
|
214 |
#include <sys/vfs.h> |
|
215 |
#include <sys/systeminfo.h> |
|
216 |
#include <sys/policy.h> |
|
217 |
#include <sys/cred_impl.h> |
|
218 |
#include <sys/contract_impl.h> |
|
219 |
#include <sys/contract/process_impl.h> |
|
220 |
#include <sys/class.h> |
|
221 |
#include <sys/pool.h> |
|
222 |
#include <sys/pool_pset.h> |
|
223 |
#include <sys/pset.h> |
|
224 |
#include <sys/sysmacros.h> |
|
225 |
#include <sys/callb.h> |
|
226 |
#include <sys/vmparam.h> |
|
227 |
#include <sys/corectl.h> |
|
2677 | 228 |
#include <sys/ipc_impl.h> |
12273 | 229 |
#include <sys/klpd.h> |
0 | 230 |
|
231 |
#include <sys/door.h> |
|
232 |
#include <sys/cpuvar.h> |
|
5880 | 233 |
#include <sys/sdt.h> |
0 | 234 |
|
235 |
#include <sys/uadmin.h> |
|
236 |
#include <sys/session.h> |
|
237 |
#include <sys/cmn_err.h> |
|
238 |
#include <sys/modhash.h> |
|
2267 | 239 |
#include <sys/sunddi.h> |
0 | 240 |
#include <sys/nvpair.h> |
241 |
#include <sys/rctl.h> |
|
242 |
#include <sys/fss.h> |
|
2712 | 243 |
#include <sys/brand.h> |
0 | 244 |
#include <sys/zone.h> |
3448 | 245 |
#include <net/if.h> |
3792 | 246 |
#include <sys/cpucaps.h> |
3247 | 247 |
#include <vm/seg.h> |
10616 | 248 |
#include <sys/mac.h> |
249 |
|
250 |
/* List of data link IDs which are accessible from the zone */ |
251 |
typedef struct zone_dl { |
252 |
datalink_id_t zdl_id; |
253 |
list_node_t zdl_linkage; |
254 |
} zone_dl_t; |
3247 | 255 |
|
0 | 256 |
/* |
257 |
* cv used to signal that all references to the zone have been released. This |
|
258 |
* needs to be global since there may be multiple waiters, and the first to |
|
259 |
* wake up will free the zone_t, hence we cannot use zone->zone_cv. |
|
260 |
*/ |
|
261 |
static kcondvar_t zone_destroy_cv; |
|
262 |
/* |
|
263 |
* Lock used to serialize access to zone_cv. This could have been per-zone, |
|
264 |
* but then we'd need another lock for zone_destroy_cv, and why bother? |
|
265 |
*/ |
|
266 |
static kmutex_t zone_status_lock; |
|
267 |
||
268 |
/* |
|
269 |
* ZSD-related global variables. |
|
270 |
*/ |
|
271 |
static kmutex_t zsd_key_lock; /* protects the following two */ |
|
272 |
/* |
|
273 |
* The next caller of zone_key_create() will be assigned a key of ++zsd_keyval. |
|
274 |
*/ |
|
275 |
static zone_key_t zsd_keyval = 0; |
|
276 |
/* |
|
277 |
* Global list of registered keys. We use this when a new zone is created. |
|
278 |
*/ |
|
279 |
static list_t zsd_registered_keys; |
|
280 |
||
281 |
int zone_hash_size = 256; |
|
1676 | 282 |
static mod_hash_t *zonehashbyname, *zonehashbyid, *zonehashbylabel; |
0 | 283 |
static kmutex_t zonehash_lock; |
284 |
static uint_t zonecount; |
|
285 |
static id_space_t *zoneid_space; |
|
286 |
||
287 |
/* |
|
288 |
* The global zone (aka zone0) is the all-seeing, all-knowing zone in which the |
|
289 |
* kernel proper runs, and which manages all other zones. |
|
290 |
* |
|
291 |
* Although not declared as static, the variable "zone0" should not be used |
|
292 |
* except by code that needs to reference the global zone early in boot, |
|
293 |
* before it is fully initialized. All other consumers should use |
|
294 |
* 'global_zone'. |
|
295 |
*/ |
|
296 |
zone_t zone0; |
|
297 |
zone_t *global_zone = NULL; /* Set when the global zone is initialized */ |
|
298 |
||
299 |
/* |
|
300 |
* List of active zones, protected by zonehash_lock. |
|
301 |
*/ |
|
302 |
static list_t zone_active; |
|
303 |
||
304 |
/* |
|
305 |
* List of destroyed zones that still have outstanding cred references. |
|
306 |
* Used for debugging. Uses a separate lock to avoid lock ordering |
|
307 |
* problems in zone_free. |
|
308 |
*/ |
|
309 |
static list_t zone_deathrow; |
|
310 |
static kmutex_t zone_deathrow_lock; |
|
311 |
||
312 |
/* number of zones is limited by virtual interface limit in IP */ |
|
313 |
uint_t maxzones = 8192; |
|
314 |
||
1166 | 315 |
/* Event channel used to send zone state change notifications */ |
316 |
evchan_t *zone_event_chan; |
|
317 |
||
318 |
/* |
|
319 |
* This table holds the mapping from kernel zone states to |
|
320 |
* states visible in the state notification API. |
|
321 |
* The idea is that we only expose "obvious" states and |
|
322 |
* do not expose states which are just implementation details. |
|
323 |
*/ |
|
324 |
const char *zone_status_table[] = { |
|
325 |
ZONE_EVENT_UNINITIALIZED, /* uninitialized */ |
|
5880 | 326 |
ZONE_EVENT_INITIALIZED, /* initialized */ |
1166 | 327 |
ZONE_EVENT_READY, /* ready */ |
328 |
ZONE_EVENT_READY, /* booting */ |
|
329 |
ZONE_EVENT_RUNNING, /* running */ |
|
330 |
ZONE_EVENT_SHUTTING_DOWN, /* shutting_down */ |
|
331 |
ZONE_EVENT_SHUTTING_DOWN, /* empty */ |
|
332 |
ZONE_EVENT_SHUTTING_DOWN, /* down */ |
|
333 |
ZONE_EVENT_SHUTTING_DOWN, /* dying */ |
|
334 |
ZONE_EVENT_UNINITIALIZED, /* dead */ |
|
335 |
}; |
|
336 |
||
0 | 337 |
/* |
338 |
* This isn't static so lint doesn't complain. |
|
339 |
*/ |
|
340 |
rctl_hndl_t rc_zone_cpu_shares; |
|
2768 | 341 |
rctl_hndl_t rc_zone_locked_mem; |
3247 | 342 |
rctl_hndl_t rc_zone_max_swap; |
3792 | 343 |
rctl_hndl_t rc_zone_cpu_cap; |
0 | 344 |
rctl_hndl_t rc_zone_nlwps; |
2677 | 345 |
rctl_hndl_t rc_zone_shmmax; |
346 |
rctl_hndl_t rc_zone_shmmni; |
347 |
rctl_hndl_t rc_zone_semmni; |
348 |
rctl_hndl_t rc_zone_msgmni; |
0 | 349 |
/* |
350 |
* Synchronization primitives used to synchronize between mounts and zone |
|
351 |
* creation/destruction. |
|
352 |
*/ |
|
353 |
static int mounts_in_progress; |
|
354 |
static kcondvar_t mount_cv; |
|
355 |
static kmutex_t mount_lock; |
|
356 |
||
2267 | 357 |
const char * const zone_default_initname = "/sbin/init"; |
1676 | 358 |
static char * const zone_prefix = "/zone/"; |
0 | 359 |
static int zone_shutdown(zoneid_t zoneid); |
10616 | 360 |
static int zone_add_datalink(zoneid_t, datalink_id_t); |
361 |
static int zone_remove_datalink(zoneid_t, datalink_id_t); |
362 |
static int zone_list_datalink(zoneid_t, int *, datalink_id_t *); |
0 | 363 |
|
5880 | 364 |
typedef boolean_t zsd_applyfn_t(kmutex_t *, boolean_t, zone_t *, zone_key_t); |
365 |
||
366 |
static void zsd_apply_all_zones(zsd_applyfn_t *, zone_key_t); |
|
367 |
static void zsd_apply_all_keys(zsd_applyfn_t *, zone_t *); |
|
368 |
static boolean_t zsd_apply_create(kmutex_t *, boolean_t, zone_t *, zone_key_t); |
|
369 |
static boolean_t zsd_apply_shutdown(kmutex_t *, boolean_t, zone_t *, |
|
370 |
zone_key_t); |
|
371 |
static boolean_t zsd_apply_destroy(kmutex_t *, boolean_t, zone_t *, zone_key_t); |
|
372 |
static boolean_t zsd_wait_for_creator(zone_t *, struct zsd_entry *, |
|
373 |
kmutex_t *); |
|
374 |
static boolean_t zsd_wait_for_inprogress(zone_t *, struct zsd_entry *, |
|
375 |
kmutex_t *); |
|
376 |
||
0 | 377 |
/* |
813 | 378 |
* Bump this number when you alter the zone syscall interfaces; this is |
379 |
* because we need to have support for previous API versions in libc |
|
380 |
* to support patching; libc calls into the kernel to determine this number. |
|
381 |
* |
|
382 |
* Version 1 of the API is the version originally shipped with Solaris 10 |
|
383 |
* Version 2 alters the zone_create system call in order to support more |
|
384 |
* arguments by moving the args into a structure; and to do better |
|
385 |
* error reporting when zone_create() fails. |
|
386 |
* Version 3 alters the zone_create system call in order to support the |
|
387 |
* import of ZFS datasets to zones. |
|
1676 | 388 |
* Version 4 alters the zone_create system call in order to support |
389 |
* Trusted Extensions. |
|
2267 | 390 |
* Version 5 alters the zone_boot system call, and converts its old |
391 |
* bootargs parameter to be set by the zone_setattr API instead. |
|
3448 | 392 |
* Version 6 adds the flag argument to zone_create. |
813 | 393 |
*/ |
3448 | 394 |
static const int ZONE_SYSCALL_API_VERSION = 6; |
813 | 395 |
|
396 |
/* |
|
0 | 397 |
* Certain filesystems (such as NFS and autofs) need to know which zone |
398 |
* the mount is being placed in. Because of this, we need to be able to |
|
399 |
* ensure that a zone isn't in the process of being created such that |
|
400 |
* nfs_mount() thinks it is in the global zone, while by the time it |
|
401 |
* gets added to the list of mounted zones, it ends up on zoneA's mount |
|
402 |
* list. |
|
403 |
* |
|
404 |
* The following functions: block_mounts()/resume_mounts() and |
|
405 |
* mount_in_progress()/mount_completed() are used by zones and the VFS |
|
406 |
* layer (respectively) to synchronize zone creation and new mounts. |
|
407 |
* |
|
408 |
* The semantics are like a reader-reader lock such that there may |
|
409 |
* either be multiple mounts (or zone creations, if that weren't |
|
410 |
* serialized by zonehash_lock) in progress at the same time, but not |
|
411 |
* both. |
|
412 |
* |
|
413 |
* We use cv's so the user can ctrl-C out of the operation if it's |
|
414 |
* taking too long. |
|
415 |
* |
|
416 |
* The semantics are such that there is unfair bias towards the |
|
417 |
* "current" operation. This means that zone creations may starve if |
|
418 |
* there is a rapid succession of new mounts coming in to the system, or |
|
419 |
* there is a remote possibility that zones will be created at such a |
|
420 |
* rate that new mounts will not be able to proceed. |
|
421 |
*/ |
|
422 |
/* |
|
423 |
* Prevent new mounts from progressing to the point of calling |
|
424 |
* VFS_MOUNT(). If there are already mounts in this "region", wait for |
|
425 |
* them to complete. |
|
426 |
*/ |
|
427 |
static int |
|
428 |
block_mounts(void) |
|
429 |
{ |
|
430 |
int retval = 0; |
|
431 |
||
432 |
/* |
|
433 |
* Since it may block for a long time, block_mounts() shouldn't be |
|
434 |
* called with zonehash_lock held. |
|
435 |
*/ |
|
436 |
ASSERT(MUTEX_NOT_HELD(&zonehash_lock)); |
|
437 |
mutex_enter(&mount_lock); |
|
438 |
while (mounts_in_progress > 0) { |
|
439 |
if (cv_wait_sig(&mount_cv, &mount_lock) == 0) |
|
440 |
goto signaled; |
|
441 |
} |
|
442 |
/* |
|
443 |
* A negative value of mounts_in_progress indicates that mounts |
|
444 |
* have been blocked by (-mounts_in_progress) different callers. |
|
445 |
*/ |
|
446 |
mounts_in_progress--; |
|
447 |
retval = 1; |
|
448 |
signaled: |
|
449 |
mutex_exit(&mount_lock); |
|
450 |
return (retval); |
|
451 |
} |
|
452 |
||
453 |
/* |
|
454 |
* The VFS layer may progress with new mounts as far as we're concerned. |
|
455 |
* Allow them to progress if we were the last obstacle. |
|
456 |
*/ |
|
457 |
static void |
|
458 |
resume_mounts(void) |
|
459 |
{ |
|
460 |
mutex_enter(&mount_lock); |
|
461 |
if (++mounts_in_progress == 0) |
|
462 |
cv_broadcast(&mount_cv); |
|
463 |
mutex_exit(&mount_lock); |
|
464 |
} |
|
465 |
||
466 |
/* |
|
467 |
* The VFS layer is busy with a mount; zones should wait until all |
|
468 |
* mounts are completed to progress. |
|
469 |
*/ |
|
470 |
void |
|
471 |
mount_in_progress(void) |
|
472 |
{ |
|
473 |
mutex_enter(&mount_lock); |
|
474 |
while (mounts_in_progress < 0) |
|
475 |
cv_wait(&mount_cv, &mount_lock); |
|
476 |
mounts_in_progress++; |
|
477 |
mutex_exit(&mount_lock); |
|
478 |
} |
|
479 |
||
480 |
/* |
|
481 |
* VFS is done with one mount; wake up any waiting block_mounts() |
|
482 |
* callers if this is the last mount. |
|
483 |
*/ |
|
484 |
void |
|
485 |
mount_completed(void) |
|
486 |
{ |
|
487 |
mutex_enter(&mount_lock); |
|
488 |
if (--mounts_in_progress == 0) |
|
489 |
cv_broadcast(&mount_cv); |
|
490 |
mutex_exit(&mount_lock); |
|
491 |
} |
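The signed-counter scheme used by block_mounts()/resume_mounts() and mount_in_progress()/mount_completed() can be tried out in userland with POSIX threads. The sketch below is an illustration only: the `gate_*` names are hypothetical, pthread mutexes and condition variables stand in for the kernel's kmutex_t and kcondvar_t, and the signal-interruptible wait that block_mounts() gets from cv_wait_sig() is not modeled.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Userland model of the mount / zone-creation gate. The counter is
 * positive while mounts are in progress, negative while blockers hold
 * the gate, and zero when idle. All names here are hypothetical.
 */
typedef struct gate {
	pthread_mutex_t	g_lock;
	pthread_cond_t	g_cv;
	int		g_count;
} gate_t;

static void
gate_init(gate_t *g)
{
	pthread_mutex_init(&g->g_lock, NULL);
	pthread_cond_init(&g->g_cv, NULL);
	g->g_count = 0;
}

/* Blocker side: wait out running mounts, then push the count negative. */
static void
gate_block(gate_t *g)
{
	pthread_mutex_lock(&g->g_lock);
	while (g->g_count > 0)
		pthread_cond_wait(&g->g_cv, &g->g_lock);
	g->g_count--;
	pthread_mutex_unlock(&g->g_lock);
}

/* Blocker side: release; the last blocker out wakes waiting mounts. */
static void
gate_resume(gate_t *g)
{
	pthread_mutex_lock(&g->g_lock);
	assert(g->g_count < 0);
	if (++g->g_count == 0)
		pthread_cond_broadcast(&g->g_cv);
	pthread_mutex_unlock(&g->g_lock);
}

/* Mount side: wait while any blocker holds the gate, then count up. */
static void
gate_mount_begin(gate_t *g)
{
	pthread_mutex_lock(&g->g_lock);
	while (g->g_count < 0)
		pthread_cond_wait(&g->g_cv, &g->g_lock);
	g->g_count++;
	pthread_mutex_unlock(&g->g_lock);
}

/* Mount side: the last mount out wakes waiting blockers. */
static void
gate_mount_end(gate_t *g)
{
	pthread_mutex_lock(&g->g_lock);
	assert(g->g_count > 0);
	if (--g->g_count == 0)
		pthread_cond_broadcast(&g->g_cv);
	pthread_mutex_unlock(&g->g_lock);
}

/* Read the counter under the lock (for inspection only). */
static int
gate_count(gate_t *g)
{
	int count;

	pthread_mutex_lock(&g->g_lock);
	count = g->g_count;
	pthread_mutex_unlock(&g->g_lock);
	return (count);
}
```

A single counter is enough because the two sides never overlap: whichever side currently owns the gate can keep admitting its own kind, which is also the source of the starvation bias noted in the comment above the functions.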
|
492 |
||
493 |
/* |
|
494 |
* ZSD routines. |
|
495 |
* |
|
496 |
* Zone Specific Data (ZSD) is modeled after Thread Specific Data as |
|
497 |
* defined by the pthread_key_create() and related interfaces. |
|
498 |
* |
|
499 |
* Kernel subsystems may register one or more data items and/or |
|
500 |
* callbacks to be executed when a zone is created, shut down, or |
|
501 |
* destroyed. |
|
502 |
* |
|
503 |
* Unlike the thread counterpart, destructor callbacks will be executed |
|
504 |
* even if the data pointer is NULL and/or there are no constructor |
|
505 |
* callbacks, so it is the responsibility of such callbacks to check for |
|
506 |
* NULL data values if necessary. |
|
507 |
* |
|
508 |
* The locking strategy and overall picture is as follows: |
|
509 |
* |
|
510 |
* When someone calls zone_key_create(), a template ZSD entry is added to the |
|
5880 | 511 |
* global list "zsd_registered_keys", protected by zsd_key_lock. While |
512 |
* holding that lock all the existing zones are marked as |
|
513 |
* ZSD_CREATE_NEEDED and a copy of the ZSD entry added to the per-zone |
|
514 |
* zone_zsd list (protected by zone_lock). The global list is updated first |
|
515 |
* (under zsd_key_lock) to make sure that newly created zones use the |
|
516 |
* most recent list of keys. Then under zonehash_lock we walk the zones |
|
517 |
* and mark them. Similar locking is used in zone_key_delete(). |
|
0 | 518 |
* |
5880 | 519 |
* The actual create, shutdown, and destroy callbacks are done without |
520 |
* holding any lock. And zsd_flags are used to ensure that the operations |
|
521 |
* completed so that when zone_key_create (and zone_create) is done, as well as |
|
522 |
* zone_key_delete (and zone_destroy) is done, all the necessary callbacks |
|
523 |
* are completed. |
|
0 | 524 |
* |
525 |
* When new zones are created constructor callbacks for all registered ZSD |
|
5880 | 526 |
* entries will be called. That also uses the above two phases of marking |
527 |
* what needs to be done, and then running the callbacks without holding |
|
528 |
* any locks. |
|
0 | 529 |
* |
530 |
* The framework does not provide any locking around zone_getspecific() and |
|
531 |
* zone_setspecific() apart from that needed for internal consistency, so |
|
532 |
* callers interested in atomic "test-and-set" semantics will need to provide |
|
533 |
* their own locking. |
|
534 |
*/ |
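The "mark first, run callbacks later" pattern described in the comment above can be illustrated with a small userland model. This is a sketch under stated assumptions: the `model_*` names and flag values are hypothetical (only ZSD_CREATE_NEEDED is named in the source), locking is omitted, and the in-progress bookkeeping the kernel needs for concurrent callers is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags; only the "needed" idea comes from the source. */
#define	MODEL_CREATE_NEEDED	0x1
#define	MODEL_CREATE_DONE	0x2
#define	MODEL_NZONES		3

typedef struct model_entry {
	unsigned int	me_flags;
	void		*me_data;
} model_entry_t;

typedef void *(*model_create_fn)(int zoneid);

/* Phase 1: mark every entry. In the kernel this happens under locks. */
static void
model_mark(model_entry_t *entries, int n)
{
	int i;

	for (i = 0; i < n; i++)
		entries[i].me_flags |= MODEL_CREATE_NEEDED;
}

/*
 * Phase 2: run the callback for each marked entry. The flag is cleared
 * before the callback runs, so a second pass finds nothing to do.
 * Returns the number of callbacks invoked.
 */
static int
model_run(model_entry_t *entries, int n, model_create_fn create)
{
	int i, ran = 0;

	for (i = 0; i < n; i++) {
		if (!(entries[i].me_flags & MODEL_CREATE_NEEDED))
			continue;
		entries[i].me_flags &= ~MODEL_CREATE_NEEDED;
		entries[i].me_data = create(i);
		entries[i].me_flags |= MODEL_CREATE_DONE;
		ran++;
	}
	return (ran);
}

/* Sample constructor: hands back a per-zone slot from a static array. */
static int model_sample_data[MODEL_NZONES];

static void *
model_sample_create(int zoneid)
{
	model_sample_data[zoneid] = zoneid + 100;
	return (&model_sample_data[zoneid]);
}
```

Splitting the work this way means the callback, which may block or take its own locks, never runs while the list locks are held; the flags carry the "what still needs doing" state between the two phases.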
|
535 |
||
536 |
/* |
|
537 |
* Helper function to find the zsd_entry associated with the key in the |
|
538 |
* given list. |
|
539 |
*/ |
|
540 |
static struct zsd_entry * |
|
541 |
zsd_find(list_t *l, zone_key_t key) |
|
542 |
{ |
|
543 |
struct zsd_entry *zsd; |
|
544 |
||
545 |
for (zsd = list_head(l); zsd != NULL; zsd = list_next(l, zsd)) { |
|
546 |
if (zsd->zsd_key == key) { |
|
5880 | 547 |
return (zsd); |
548 |
} |
|
549 |
} |
|
550 |
return (NULL); |
|
551 |
} |
|
552 |
||
553 |
/* |
|
554 |
* Helper function to find the zsd_entry associated with the key in the |
|
555 |
* given list. Move it to the front of the list. |
|
556 |
*/ |
|
557 |
static struct zsd_entry * |
|
558 |
zsd_find_mru(list_t *l, zone_key_t key) |
|
559 |
{ |
|
560 |
struct zsd_entry *zsd; |
|
561 |
||
562 |
for (zsd = list_head(l); zsd != NULL; zsd = list_next(l, zsd)) { |
|
563 |
if (zsd->zsd_key == key) { |
|
0 | 564 |
/* |
565 |
* Move to head of list to keep list in MRU order. |
|
566 |
*/ |
|
567 |
if (zsd != list_head(l)) { |
|
568 |
list_remove(l, zsd); |
|
569 |
list_insert_head(l, zsd); |
|
570 |
} |
|
571 |
return (zsd); |
|
572 |
} |
|
573 |
} |
|
574 |
return (NULL); |
|
575 |
} |
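The move-to-front behaviour of zsd_find_mru() can be exercised outside the kernel with a hand-rolled doubly linked list. The sketch below is illustrative only: the `mru_*` and `entry_t` names are hypothetical, and the list helpers are simplified stand-ins for the kernel's list_t routines.

```c
#include <assert.h>
#include <stddef.h>

typedef struct entry {
	unsigned int	e_key;
	struct entry	*e_prev;
	struct entry	*e_next;
} entry_t;

typedef struct mru_list {
	entry_t	*l_head;
} mru_list_t;

/* Link an entry in at the front of the list. */
static void
mru_insert_head(mru_list_t *l, entry_t *e)
{
	e->e_prev = NULL;
	e->e_next = l->l_head;
	if (l->l_head != NULL)
		l->l_head->e_prev = e;
	l->l_head = e;
}

/* Unlink an entry from wherever it sits in the list. */
static void
mru_remove(mru_list_t *l, entry_t *e)
{
	if (e->e_prev != NULL)
		e->e_prev->e_next = e->e_next;
	else
		l->l_head = e->e_next;
	if (e->e_next != NULL)
		e->e_next->e_prev = e->e_prev;
	e->e_prev = e->e_next = NULL;
}

/*
 * Linear search by key. A hit that is not already at the head is moved
 * there, so recently used keys are found sooner on the next lookup.
 */
static entry_t *
mru_find(mru_list_t *l, unsigned int key)
{
	entry_t *e;

	for (e = l->l_head; e != NULL; e = e->e_next) {
		if (e->e_key == key) {
			if (e != l->l_head) {
				mru_remove(l, e);
				mru_insert_head(l, e);
			}
			return (e);
		}
	}
	return (NULL);
}
```

Note that a lookup mutates the list, so callers need the same lock they would hold for an insert or remove; a miss leaves the order untouched.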
|
576 |
||
5880 | 577 |
void |
578 |
zone_key_create(zone_key_t *keyp, void *(*create)(zoneid_t), |
|
579 |
void (*shutdown)(zoneid_t, void *), void (*destroy)(zoneid_t, void *)) |
|
580 |
{ |
|
581 |
struct zsd_entry *zsdp; |
|
582 |
struct zsd_entry *t; |
|
583 |
struct zone *zone; |
|
584 |
zone_key_t key; |
|
585 |
||
586 |
zsdp = kmem_zalloc(sizeof (*zsdp), KM_SLEEP); |
|
587 |
zsdp->zsd_data = NULL; |
|
588 |
zsdp->zsd_create = create; |
|
589 |
zsdp->zsd_shutdown = shutdown; |
|
590 |
zsdp->zsd_destroy = destroy; |
|
591 |
||
592 |
/* |
|
593 |
* Insert in global list of callbacks. Makes future zone creations |
|
594 |
* see it. |
|
595 |
*/ |
|
596 |
mutex_enter(&zsd_key_lock); |
|
10865 | 597 |
key = zsdp->zsd_key = ++zsd_keyval; |
5880 | 598 |
ASSERT(zsd_keyval != 0); |
599 |
list_insert_tail(&zsd_registered_keys, zsdp); |
|
600 |
mutex_exit(&zsd_key_lock); |
|
601 |
||
602 |
/* |
|
603 |
* Insert for all existing zones and mark them as needing |
|
604 |
* a create callback. |
|
605 |
*/ |
|
606 |
mutex_enter(&zonehash_lock); /* stop the world */ |
|
607 |
for (zone = list_head(&zone_active); zone != NULL; |
|
608 |
zone = list_next(&zone_active, zone)) { |
|
609 |
zone_status_t status; |
|
610 |
||
611 |
mutex_enter(&zone->zone_lock); |
|
612 |
||
613 |
/* Skip zones that are on the way down or not yet up */ |
|
614 |
status = zone_status_get(zone); |
|
615 |
if (status >= ZONE_IS_DOWN || |
|
616 |
status == ZONE_IS_UNINITIALIZED) { |
|
617 |
mutex_exit(&zone->zone_lock); |
|
618 |
continue; |
|
619 |
} |
|
620 |
||
621 |
t = zsd_find_mru(&zone->zone_zsd, key); |
|
622 |
if (t != NULL) { |
|
623 |
/* |
|
624 |
* A zsd_configure already inserted it after |
|
625 |
* we dropped zsd_key_lock above. |
|
626 |
*/ |
|
627 |
mutex_exit(&zone->zone_lock); |
|
628 |
continue; |
|
629 |
} |
|
630 |
t = kmem_zalloc(sizeof (*t), KM_SLEEP); |
|
631 |
t->zsd_key = key; |
|
632 |
t->zsd_create = create; |
|
633 |
t->zsd_shutdown = shutdown; |
|
634 |
t->zsd_destroy = destroy; |
|
635 |
if (create != NULL) { |
|
636 |
t->zsd_flags = ZSD_CREATE_NEEDED; |
|
637 |
DTRACE_PROBE2(zsd__create__needed, |
|
638 |
zone_t *, zone, zone_key_t, key); |
|
639 |
} |
|
640 |
list_insert_tail(&zone->zone_zsd, t); |
|
641 |
mutex_exit(&zone->zone_lock); |
|
642 |
} |
|
643 |
mutex_exit(&zonehash_lock); |
|
644 |
||
645 |
if (create != NULL) { |
|
646 |
/* Now call the create callback for this key */ |
|
647 |
zsd_apply_all_zones(zsd_apply_create, key); |
|
648 |
} |
|
10865 | 649 |
/* |
10910 | 650 |
* It is safe for consumers to use the key now, make it |
651 |
* globally visible. Specifically zone_getspecific() will |
652 |
* always successfully return the zone specific data associated |
653 |
* with the key. |
951a65b3846b
PSARC/2009/566 Provide minor private interface modifications to support mntfs
Robert Harris <Robert.Harris@Sun.COM>
parents:
10865
diff
changeset
|
654 |
*/ |
10865
ff55368ffe7b
6747527 BAD TRAP: type=31 when accounting is turned ON for zones
Pramod Batni <Pramod.Batni@Sun.COM>
parents:
10616
diff
changeset
|
655 |
*keyp = key; |
ff55368ffe7b
6747527 BAD TRAP: type=31 when accounting is turned ON for zones
Pramod Batni <Pramod.Batni@Sun.COM>
parents:
10616
diff
changeset
|
656 |
|
5880 | 657 |
} |

/*
 * Function called when a module is being unloaded, or otherwise wishes
 * to unregister its ZSD key and callbacks.
 *
 * Remove from the global list and determine the functions that need to
 * be called under a global lock. Then call the functions without
 * holding any locks. Finally free up the zone_zsd entries. (The apply
 * functions need to access the zone_zsd entries to find zsd_data etc.)
 */
int
zone_key_delete(zone_key_t key)
{
	struct zsd_entry *zsdp = NULL;
	zone_t *zone;

	mutex_enter(&zsd_key_lock);
	zsdp = zsd_find_mru(&zsd_registered_keys, key);
	if (zsdp == NULL) {
		mutex_exit(&zsd_key_lock);
		return (-1);
	}
	list_remove(&zsd_registered_keys, zsdp);
	mutex_exit(&zsd_key_lock);

	mutex_enter(&zonehash_lock);
	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		struct zsd_entry *del;

		mutex_enter(&zone->zone_lock);
		del = zsd_find_mru(&zone->zone_zsd, key);
		if (del == NULL) {
			/*
			 * Somebody else got here first e.g. the zone going
			 * away.
			 */
			mutex_exit(&zone->zone_lock);
			continue;
		}
		ASSERT(del->zsd_shutdown == zsdp->zsd_shutdown);
		ASSERT(del->zsd_destroy == zsdp->zsd_destroy);
		if (del->zsd_shutdown != NULL &&
		    (del->zsd_flags & ZSD_SHUTDOWN_ALL) == 0) {
			del->zsd_flags |= ZSD_SHUTDOWN_NEEDED;
			DTRACE_PROBE2(zsd__shutdown__needed,
			    zone_t *, zone, zone_key_t, key);
		}
		if (del->zsd_destroy != NULL &&
		    (del->zsd_flags & ZSD_DESTROY_ALL) == 0) {
			del->zsd_flags |= ZSD_DESTROY_NEEDED;
			DTRACE_PROBE2(zsd__destroy__needed,
			    zone_t *, zone, zone_key_t, key);
		}
		mutex_exit(&zone->zone_lock);
	}
	mutex_exit(&zonehash_lock);
	kmem_free(zsdp, sizeof (*zsdp));

	/* Now call the shutdown and destroy callback for this key */
	zsd_apply_all_zones(zsd_apply_shutdown, key);
	zsd_apply_all_zones(zsd_apply_destroy, key);

	/* Now we can free up the zsdp structures in each zone */
	mutex_enter(&zonehash_lock);
	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		struct zsd_entry *del;

		mutex_enter(&zone->zone_lock);
		del = zsd_find(&zone->zone_zsd, key);
		if (del != NULL) {
			list_remove(&zone->zone_zsd, del);
			ASSERT(!(del->zsd_flags & ZSD_ALL_INPROGRESS));
			kmem_free(del, sizeof (*del));
		}
		mutex_exit(&zone->zone_lock);
	}
	mutex_exit(&zonehash_lock);

	return (0);
}

/*
 * ZSD counterpart of pthread_setspecific().
 *
 * Since all zsd callbacks, including those with no create function,
 * have an entry in zone_zsd, if the key is registered it is part of
 * the zone_zsd list.
 * Return an error if the key wasn't registered.
 */
int
zone_setspecific(zone_key_t key, zone_t *zone, const void *data)
{
	struct zsd_entry *t;

	mutex_enter(&zone->zone_lock);
	t = zsd_find_mru(&zone->zone_zsd, key);
	if (t != NULL) {
		/*
		 * Replace old value with new
		 */
		t->zsd_data = (void *)data;
		mutex_exit(&zone->zone_lock);
		return (0);
	}
	mutex_exit(&zone->zone_lock);
	return (-1);
}

/*
 * ZSD counterpart of pthread_getspecific().
 */
void *
zone_getspecific(zone_key_t key, zone_t *zone)
{
	struct zsd_entry *t;
	void *data;

	mutex_enter(&zone->zone_lock);
	t = zsd_find_mru(&zone->zone_zsd, key);
	data = (t == NULL ? NULL : t->zsd_data);
	mutex_exit(&zone->zone_lock);
	return (data);
}
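
The set/get contract of zone_setspecific() and zone_getspecific() can be sketched in user space as below. This is only an illustration under stated assumptions: the names demo_key_t, demo_entry, demo_zone, demo_find, demo_setspecific and demo_getspecific are hypothetical, a plain singly linked list stands in for the kernel list, and the zone_lock locking is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's zsd_entry or zone_t. */
typedef unsigned int demo_key_t;

struct demo_entry {
	demo_key_t		key;
	void			*data;
	struct demo_entry	*next;
};

struct demo_zone {
	struct demo_entry	*entries;	/* one entry per registered key */
};

/* Linear search for the entry that belongs to key; NULL if there is none. */
static struct demo_entry *
demo_find(struct demo_zone *z, demo_key_t key)
{
	struct demo_entry *e;

	assert(z != NULL);
	for (e = z->entries; e != NULL; e = e->next) {
		if (e->key == key)
			return (e);
	}
	return (NULL);
}

/* Like zone_setspecific(): 0 on success, -1 if the key has no entry. */
static int
demo_setspecific(struct demo_zone *z, demo_key_t key, const void *data)
{
	struct demo_entry *e = demo_find(z, key);

	if (e == NULL)
		return (-1);
	e->data = (void *)data;		/* replace old value with new */
	return (0);
}

/* Like zone_getspecific(): NULL if the key has no entry or no data yet. */
static void *
demo_getspecific(struct demo_zone *z, demo_key_t key)
{
	struct demo_entry *e = demo_find(z, key);

	return (e == NULL ? NULL : e->data);
}
```

Note that a NULL return from the getter does not distinguish "key not registered" from "no data stored yet"; only the setter reports an unregistered key.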

/*
 * Function used to initialize a zone's list of ZSD callbacks and data
 * when the zone is being created. The callbacks are initialized from
 * the template list (zsd_registered_keys). The constructor callback is
 * executed later (once the zone exists and with locks dropped).
 */
static void
zone_zsd_configure(zone_t *zone)
{
	struct zsd_entry *zsdp;
	struct zsd_entry *t;

	ASSERT(MUTEX_HELD(&zonehash_lock));
	ASSERT(list_head(&zone->zone_zsd) == NULL);
	mutex_enter(&zone->zone_lock);
	mutex_enter(&zsd_key_lock);
	for (zsdp = list_head(&zsd_registered_keys); zsdp != NULL;
	    zsdp = list_next(&zsd_registered_keys, zsdp)) {
		/*
		 * Since this zone is ZONE_IS_UNCONFIGURED, zone_key_create
		 * should not have added anything to it.
		 */
		ASSERT(zsd_find(&zone->zone_zsd, zsdp->zsd_key) == NULL);

		t = kmem_zalloc(sizeof (*t), KM_SLEEP);
		t->zsd_key = zsdp->zsd_key;
		t->zsd_create = zsdp->zsd_create;
		t->zsd_shutdown = zsdp->zsd_shutdown;
		t->zsd_destroy = zsdp->zsd_destroy;
		if (zsdp->zsd_create != NULL) {
			t->zsd_flags = ZSD_CREATE_NEEDED;
			DTRACE_PROBE2(zsd__create__needed,
			    zone_t *, zone, zone_key_t, zsdp->zsd_key);
		}
		list_insert_tail(&zone->zone_zsd, t);
	}
	mutex_exit(&zsd_key_lock);
	mutex_exit(&zone->zone_lock);
}

enum zsd_callback_type { ZSD_CREATE, ZSD_SHUTDOWN, ZSD_DESTROY };

/*
 * Helper function to execute shutdown or destructor callbacks.
 */
static void
zone_zsd_callbacks(zone_t *zone, enum zsd_callback_type ct)
{
	struct zsd_entry *t;

	ASSERT(ct == ZSD_SHUTDOWN || ct == ZSD_DESTROY);
	ASSERT(ct != ZSD_SHUTDOWN || zone_status_get(zone) >= ZONE_IS_EMPTY);
	ASSERT(ct != ZSD_DESTROY || zone_status_get(zone) >= ZONE_IS_DOWN);

	/*
	 * Run the callback solely based on what is registered for the zone
	 * in zone_zsd. The global list can change independently of this
	 * as keys are registered and unregistered and we don't register new
	 * callbacks for a zone that is in the process of going away.
	 */
	mutex_enter(&zone->zone_lock);
	for (t = list_head(&zone->zone_zsd); t != NULL;
	    t = list_next(&zone->zone_zsd, t)) {
		zone_key_t key = t->zsd_key;

		/* Skip if no callbacks registered */

		if (ct == ZSD_SHUTDOWN) {
			if (t->zsd_shutdown != NULL &&
			    (t->zsd_flags & ZSD_SHUTDOWN_ALL) == 0) {
				t->zsd_flags |= ZSD_SHUTDOWN_NEEDED;
				DTRACE_PROBE2(zsd__shutdown__needed,
				    zone_t *, zone, zone_key_t, key);
			}
		} else {
			if (t->zsd_destroy != NULL &&
			    (t->zsd_flags & ZSD_DESTROY_ALL) == 0) {
				t->zsd_flags |= ZSD_DESTROY_NEEDED;
				DTRACE_PROBE2(zsd__destroy__needed,
				    zone_t *, zone, zone_key_t, key);
			}
		}
	}
	mutex_exit(&zone->zone_lock);

	/* Now call the shutdown and destroy callbacks for this zone */
	zsd_apply_all_keys(zsd_apply_shutdown, zone);
	zsd_apply_all_keys(zsd_apply_destroy, zone);

}

/*
 * Called when the zone is going away; free ZSD-related memory, and
 * destroy the zone_zsd list.
 */
static void
zone_free_zsd(zone_t *zone)
{
	struct zsd_entry *t, *next;

	/*
	 * Free all the zsd_entry's we had on this zone.
	 */
	mutex_enter(&zone->zone_lock);
	for (t = list_head(&zone->zone_zsd); t != NULL; t = next) {
		next = list_next(&zone->zone_zsd, t);
		list_remove(&zone->zone_zsd, t);
		ASSERT(!(t->zsd_flags & ZSD_ALL_INPROGRESS));
		kmem_free(t, sizeof (*t));
	}
	list_destroy(&zone->zone_zsd);
	mutex_exit(&zone->zone_lock);

}
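
The loop in zone_free_zsd() reads the successor before it frees the current entry. A minimal user-space sketch of that pattern follows; demo_node, demo_push and demo_free_all are hypothetical names, and malloc()/free() stand in for kmem_zalloc()/kmem_free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; the kernel code uses list_t and struct zsd_entry. */
struct demo_node {
	struct demo_node	*next;
};

/* Push a freshly allocated node onto the list; returns the new head. */
static struct demo_node *
demo_push(struct demo_node *head)
{
	struct demo_node *n = malloc(sizeof (*n));

	assert(n != NULL);
	n->next = head;
	return (n);
}

/* Free every node, reading the successor before the node is released. */
static size_t
demo_free_all(struct demo_node **headp)
{
	struct demo_node *n, *next;
	size_t freed = 0;

	for (n = *headp; n != NULL; n = next) {
		next = n->next;		/* n must not be touched after free() */
		free(n);
		freed++;
	}
	*headp = NULL;
	return (freed);
}
```

Advancing with `n = n->next` in the loop header would read freed memory, which is why the successor is saved first.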

/*
 * Apply a function to all zones for a particular key value.
 *
 * The applyfn has to drop zonehash_lock if it does some work, and
 * then reacquire it before it returns.
 * When the lock is dropped we don't follow list_next even
 * if it is possible to do so without any hazards. This is
 * because we want the design to allow for the list of zones
 * to change in any arbitrary way during the time the
 * lock was dropped.
 *
 * It is safe to restart the loop at list_head since the applyfn
 * changes the zsd_flags as it does work, so a subsequent
 * pass through will have no effect in applyfn, hence the loop will terminate
 * in at worst O(N^2).
 */
static void
zsd_apply_all_zones(zsd_applyfn_t *applyfn, zone_key_t key)
{
	zone_t *zone;

	mutex_enter(&zonehash_lock);
	zone = list_head(&zone_active);
	while (zone != NULL) {
		if ((applyfn)(&zonehash_lock, B_FALSE, zone, key)) {
			/* Lock dropped - restart at head */
			zone = list_head(&zone_active);
		} else {
			zone = list_next(&zone_active, zone);
		}
	}
	mutex_exit(&zonehash_lock);
}
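
The termination argument in the comment above can be checked with a small single-threaded model. This is a sketch under assumptions, not the kernel logic: demo_node, DEMO_WORK_NEEDED, demo_apply and demo_apply_all are hypothetical names, and "dropping the lock" is only simulated by the return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	DEMO_WORK_NEEDED	0x1	/* hypothetical, like a ZSD_*_NEEDED flag */

struct demo_node {
	unsigned int		flags;
	struct demo_node	*next;
};

/* Counts callback invocations so the cost of restarting can be observed. */
static unsigned int demo_calls;

/*
 * Hypothetical applyfn: does work only if the flag is set, clears the flag,
 * and returns true to report that the list lock was (notionally) dropped.
 */
static bool
demo_apply(struct demo_node *n)
{
	assert(n != NULL);
	demo_calls++;
	if ((n->flags & DEMO_WORK_NEEDED) == 0)
		return (false);
	n->flags &= ~DEMO_WORK_NEEDED;
	return (true);
}

/* Restart at the head after every call that reports a dropped lock. */
static void
demo_apply_all(struct demo_node *head)
{
	struct demo_node *n = head;

	while (n != NULL) {
		if (demo_apply(n))
			n = head;
		else
			n = n->next;
	}
}
```

With N nodes that all need work, the k-th node is reached after k - 1 no-op calls, and a final clean pass makes N more calls, so the total is N(N+1)/2 + N. That is quadratic, which matches the bound stated in the comment.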

/*
 * Apply a function to all keys for a particular zone.
 *
 * The applyfn has to drop zonehash_lock if it does some work, and
 * then reacquire it before it returns.
 * When the lock is dropped we don't follow list_next even
 * if it is possible to do so without any hazards. This is
 * because we want the design to allow for the list of zsd callbacks
 * to change in any arbitrary way during the time the
 * lock was dropped.
 *
 * It is safe to restart the loop at list_head since the applyfn
 * changes the zsd_flags as it does work, so a subsequent
 * pass through will have no effect in applyfn, hence the loop will terminate
 * in at worst O(N^2).
 */
static void
zsd_apply_all_keys(zsd_applyfn_t *applyfn, zone_t *zone)
{
	struct zsd_entry *t;

	mutex_enter(&zone->zone_lock);
	t = list_head(&zone->zone_zsd);
	while (t != NULL) {
		if ((applyfn)(NULL, B_TRUE, zone, t->zsd_key)) {
			/* Lock dropped - restart at head */
			t = list_head(&zone->zone_zsd);
		} else {
			t = list_next(&zone->zone_zsd, t);
		}
	}
	mutex_exit(&zone->zone_lock);
}

/*
 * Call the create function for the zone and key if CREATE_NEEDED
 * is set.
 * If some other thread gets here first and sets CREATE_INPROGRESS, then
 * we wait for that thread to complete so that we can ensure that
 * all the callbacks are done when we've looped over all zones/keys.
 *
 * When we call the create function, we drop the global lock held by the
 * caller, and return true to tell the caller it needs to re-evaluate the
 * state.
 * If the caller holds zone_lock then zone_lock_held is set, and zone_lock
 * remains held on exit.
 */
static boolean_t
zsd_apply_create(kmutex_t *lockp, boolean_t zone_lock_held,
    zone_t *zone, zone_key_t key)
{
	void *result;
	struct zsd_entry *t;
	boolean_t dropped;

	if (lockp != NULL) {
		ASSERT(MUTEX_HELD(lockp));
	}
	if (zone_lock_held) {
		ASSERT(MUTEX_HELD(&zone->zone_lock));
	} else {
		mutex_enter(&zone->zone_lock);
	}

	t = zsd_find(&zone->zone_zsd, key);
	if (t == NULL) {
		/*
		 * Somebody else got here first e.g. the zone going
		 * away.
		 */
		if (!zone_lock_held)
			mutex_exit(&zone->zone_lock);
		return (B_FALSE);
	}
	dropped = B_FALSE;
	if (zsd_wait_for_inprogress(zone, t, lockp))
		dropped = B_TRUE;

	if (t->zsd_flags & ZSD_CREATE_NEEDED) {
		t->zsd_flags &= ~ZSD_CREATE_NEEDED;
		t->zsd_flags |= ZSD_CREATE_INPROGRESS;
		DTRACE_PROBE2(zsd__create__inprogress,
		    zone_t *, zone, zone_key_t, key);
		mutex_exit(&zone->zone_lock);
		if (lockp != NULL)
			mutex_exit(lockp);

		dropped = B_TRUE;
		ASSERT(t->zsd_create != NULL);
		DTRACE_PROBE2(zsd__create__start,
		    zone_t *, zone, zone_key_t, key);

		result = (*t->zsd_create)(zone->zone_id);

		DTRACE_PROBE2(zsd__create__end,
		    zone_t *, zone, void *, result);

		ASSERT(result != NULL);
		if (lockp != NULL)
			mutex_enter(lockp);
		mutex_enter(&zone->zone_lock);
		t->zsd_data = result;
		t->zsd_flags &= ~ZSD_CREATE_INPROGRESS;
		t->zsd_flags |= ZSD_CREATE_COMPLETED;
		cv_broadcast(&t->zsd_cv);
		DTRACE_PROBE2(zsd__create__completed,
		    zone_t *, zone, zone_key_t, key);
	}
	if (!zone_lock_held)
		mutex_exit(&zone->zone_lock);
	return (dropped);
}
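
The NEEDED, INPROGRESS, COMPLETED hand-off used by zsd_apply_create() can be sketched with POSIX threads. This is a simplified model under assumptions: there is a single lock instead of the zonehash_lock/zone_lock pair, the DEMO_* flag values and the names demo_entry, demo_apply_create and demo_create are hypothetical, and DTrace probes are omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the kernel's ZSD_* constants live elsewhere. */
#define	DEMO_CREATE_NEEDED	0x1
#define	DEMO_CREATE_INPROGRESS	0x2
#define	DEMO_CREATE_COMPLETED	0x4

struct demo_entry {
	pthread_mutex_t	lock;		/* plays the role of zone_lock */
	pthread_cond_t	cv;		/* plays the role of zsd_cv */
	unsigned int	flags;
	void		*data;
	void		*(*create)(void);
};

/*
 * Run the create callback at most once. Returns true if the lock was
 * dropped, so the caller knows its view of shared state may be stale.
 */
static bool
demo_apply_create(struct demo_entry *e)
{
	bool dropped = false;
	void *result;

	pthread_mutex_lock(&e->lock);
	/* Wait out any callback another thread has in flight. */
	while (e->flags & DEMO_CREATE_INPROGRESS) {
		dropped = true;		/* the wait releases the lock */
		pthread_cond_wait(&e->cv, &e->lock);
	}
	if (e->flags & DEMO_CREATE_NEEDED) {
		/* Claim the work before unlocking so no one else runs it. */
		e->flags &= ~DEMO_CREATE_NEEDED;
		e->flags |= DEMO_CREATE_INPROGRESS;
		pthread_mutex_unlock(&e->lock);
		dropped = true;

		/* The callback runs with no locks held. */
		result = e->create();
		assert(result != NULL);

		pthread_mutex_lock(&e->lock);
		e->data = result;
		e->flags &= ~DEMO_CREATE_INPROGRESS;
		e->flags |= DEMO_CREATE_COMPLETED;
		pthread_cond_broadcast(&e->cv);
	}
	pthread_mutex_unlock(&e->lock);
	return (dropped);
}

/* Hypothetical create callback used for illustration. */
static int demo_created;

static void *
demo_create(void)
{
	demo_created++;
	return (&demo_created);
}
```

A second call on the same entry finds neither NEEDED nor INPROGRESS set, does nothing, and returns false, which is what lets the restart-at-head loops in the callers make progress.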

/*
 * Call the shutdown function for the zone and key if SHUTDOWN_NEEDED
 * is set.
 * If some other thread gets here first and sets *_INPROGRESS, then
 * we wait for that thread to complete so that we can ensure that
 * all the callbacks are done when we've looped over all zones/keys.
 *
 * When we call the shutdown function, we drop the global lock held by the
 * caller, and return true to tell the caller it needs to re-evaluate the
 * state.
 * If the caller holds zone_lock then zone_lock_held is set, and zone_lock
 * remains held on exit.
 */
static boolean_t
zsd_apply_shutdown(kmutex_t *lockp, boolean_t zone_lock_held,
    zone_t *zone, zone_key_t key)
{
	struct zsd_entry *t;
	void *data;
	boolean_t dropped;

	if (lockp != NULL) {
		ASSERT(MUTEX_HELD(lockp));
	}
	if (zone_lock_held) {
		ASSERT(MUTEX_HELD(&zone->zone_lock));
	} else {
		mutex_enter(&zone->zone_lock);
	}

	t = zsd_find(&zone->zone_zsd, key);
	if (t == NULL) {
		/*
		 * Somebody else got here first e.g. the zone going
		 * away.
		 */
		if (!zone_lock_held)
			mutex_exit(&zone->zone_lock);
		return (B_FALSE);
	}
	dropped = B_FALSE;
	if (zsd_wait_for_creator(zone, t, lockp))
		dropped = B_TRUE;

	if (zsd_wait_for_inprogress(zone, t, lockp))
		dropped = B_TRUE;

	if (t->zsd_flags & ZSD_SHUTDOWN_NEEDED) {
		t->zsd_flags &= ~ZSD_SHUTDOWN_NEEDED;
		t->zsd_flags |= ZSD_SHUTDOWN_INPROGRESS;
		DTRACE_PROBE2(zsd__shutdown__inprogress,
		    zone_t *, zone, zone_key_t, key);
		mutex_exit(&zone->zone_lock);
		if (lockp != NULL)
			mutex_exit(lockp);
		dropped = B_TRUE;

		ASSERT(t->zsd_shutdown != NULL);
		data = t->zsd_data;

		DTRACE_PROBE2(zsd__shutdown__start,
		    zone_t *, zone, zone_key_t, key);

		(t->zsd_shutdown)(zone->zone_id, data);
		DTRACE_PROBE2(zsd__shutdown__end,
		    zone_t *, zone, zone_key_t, key);

		if (lockp != NULL)
			mutex_enter(lockp);
		mutex_enter(&zone->zone_lock);
		t->zsd_flags &= ~ZSD_SHUTDOWN_INPROGRESS;
		t->zsd_flags |= ZSD_SHUTDOWN_COMPLETED;
		cv_broadcast(&t->zsd_cv);
		DTRACE_PROBE2(zsd__shutdown__completed,
		    zone_t *, zone, zone_key_t, key);
	}
	if (!zone_lock_held)
		mutex_exit(&zone->zone_lock);
	return (dropped);
}

/*
 * Call the destroy function for the zone and key if DESTROY_NEEDED
 * is set.
 * If some other thread gets here first and sets *_INPROGRESS, then
 * we wait for that thread to complete so that we can ensure that
 * all the callbacks are done when we've looped over all zones/keys.
 *
 * When we call the destroy function, we drop the global lock held by the
 * caller, and return true to tell the caller it needs to re-evaluate the
 * state.
 * If the caller holds zone_lock then zone_lock_held is set, and zone_lock
 * remains held on exit.
 */
static boolean_t
zsd_apply_destroy(kmutex_t *lockp, boolean_t zone_lock_held,
    zone_t *zone, zone_key_t key)
{
	struct zsd_entry *t;
	void *data;
	boolean_t dropped;

	if (lockp != NULL) {
		ASSERT(MUTEX_HELD(lockp));
	}
	if (zone_lock_held) {
		ASSERT(MUTEX_HELD(&zone->zone_lock));
	} else {
		mutex_enter(&zone->zone_lock);
	}

	t = zsd_find(&zone->zone_zsd, key);
	if (t == NULL) {
		/*
		 * Somebody else got here first e.g. the zone going
		 * away.
		 */
		if (!zone_lock_held)
			mutex_exit(&zone->zone_lock);
		return (B_FALSE);
	}
	dropped = B_FALSE;
	if (zsd_wait_for_creator(zone, t, lockp))
		dropped = B_TRUE;

	if (zsd_wait_for_inprogress(zone, t, lockp))
		dropped = B_TRUE;

	if (t->zsd_flags & ZSD_DESTROY_NEEDED) {
		t->zsd_flags &= ~ZSD_DESTROY_NEEDED;
		t->zsd_flags |= ZSD_DESTROY_INPROGRESS;
		DTRACE_PROBE2(zsd__destroy__inprogress,
		    zone_t *, zone, zone_key_t, key);
		mutex_exit(&zone->zone_lock);
		if (lockp != NULL)
			mutex_exit(lockp);
		dropped = B_TRUE;

		ASSERT(t->zsd_destroy != NULL);
		data = t->zsd_data;
		DTRACE_PROBE2(zsd__destroy__start,
		    zone_t *, zone, zone_key_t, key);

		(t->zsd_destroy)(zone->zone_id, data);
		DTRACE_PROBE2(zsd__destroy__end,
		    zone_t *, zone, zone_key_t, key);

		if (lockp != NULL)
			mutex_enter(lockp);
		mutex_enter(&zone->zone_lock);
		t->zsd_data = NULL;
		t->zsd_flags &= ~ZSD_DESTROY_INPROGRESS;
		t->zsd_flags |= ZSD_DESTROY_COMPLETED;
		cv_broadcast(&t->zsd_cv);
		DTRACE_PROBE2(zsd__destroy__completed,
		    zone_t *, zone, zone_key_t, key);
	}
	if (!zone_lock_held)
		mutex_exit(&zone->zone_lock);
	return (dropped);
}

/*
 * Wait for any CREATE_NEEDED flag to be cleared.
 * Returns true if lockp was temporarily dropped while waiting.
 */
static boolean_t
zsd_wait_for_creator(zone_t *zone, struct zsd_entry *t, kmutex_t *lockp)
{
	boolean_t dropped = B_FALSE;

	while (t->zsd_flags & ZSD_CREATE_NEEDED) {
		DTRACE_PROBE2(zsd__wait__for__creator,
		    zone_t *, zone, struct zsd_entry *, t);
		if (lockp != NULL) {
			dropped = B_TRUE;
			mutex_exit(lockp);
		}
		cv_wait(&t->zsd_cv, &zone->zone_lock);
		if (lockp != NULL) {
			/* First drop zone_lock to preserve order */
			mutex_exit(&zone->zone_lock);
			mutex_enter(lockp);
			mutex_enter(&zone->zone_lock);
		}
	}
	return (dropped);
}

/*
 * Wait for any INPROGRESS flag to be cleared.
 * Returns true if lockp was temporarily dropped while waiting.
 */
static boolean_t
zsd_wait_for_inprogress(zone_t *zone, struct zsd_entry *t, kmutex_t *lockp)
{
	boolean_t dropped = B_FALSE;

	while (t->zsd_flags & ZSD_ALL_INPROGRESS) {
		DTRACE_PROBE2(zsd__wait__for__inprogress,
		    zone_t *, zone, struct zsd_entry *, t);
		if (lockp != NULL) {
			dropped = B_TRUE;
			mutex_exit(lockp);
		}
		cv_wait(&t->zsd_cv, &zone->zone_lock);
		if (lockp != NULL) {
			/* First drop zone_lock to preserve order */
			mutex_exit(&zone->zone_lock);
			mutex_enter(lockp);
			mutex_enter(&zone->zone_lock);
		}
	}
	return (dropped);
}

/*
 * Frees memory associated with the zone dataset list.
 */
static void
zone_free_datasets(zone_t *zone)
{
	zone_dataset_t *t, *next;

	for (t = list_head(&zone->zone_datasets); t != NULL; t = next) {
		next = list_next(&zone->zone_datasets, t);
		list_remove(&zone->zone_datasets, t);
		kmem_free(t->zd_dataset, strlen(t->zd_dataset) + 1);
		kmem_free(t, sizeof (*t));
	}
	list_destroy(&zone->zone_datasets);
}

/*
 * zone.cpu-shares resource control support.
 */
/*ARGSUSED*/
static rctl_qty_t
zone_cpu_shares_usage(rctl_t *rctl, struct proc *p)
{
	ASSERT(MUTEX_HELD(&p->p_lock));
	return (p->p_zone->zone_shares);
}

/*ARGSUSED*/
static int
zone_cpu_shares_set(rctl_t *rctl, struct proc *p, rctl_entity_p_t *e,
    rctl_qty_t nv)
{
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	if (e->rcep_p.zone == NULL)
		return (0);

	e->rcep_p.zone->zone_shares = nv;
	return (0);
}

static rctl_ops_t zone_cpu_shares_ops = {
	rcop_no_action,
	zone_cpu_shares_usage,
	zone_cpu_shares_set,
	rcop_no_test
};

/*
 * zone.cpu-cap resource control support.
 */
/*ARGSUSED*/
static rctl_qty_t
zone_cpu_cap_get(rctl_t *rctl, struct proc *p)
{
	ASSERT(MUTEX_HELD(&p->p_lock));
	return (cpucaps_zone_get(p->p_zone));
}

/*ARGSUSED*/
static int
zone_cpu_cap_set(rctl_t *rctl, struct proc *p, rctl_entity_p_t *e,
    rctl_qty_t nv)
{
	zone_t *zone = e->rcep_p.zone;

	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);

	if (zone == NULL)
		return (0);

	/*
	 * set cap to the new value.
	 */
	return (cpucaps_zone_set(zone, nv));
}

static rctl_ops_t zone_cpu_cap_ops = {
	rcop_no_action,
	zone_cpu_cap_get,
	zone_cpu_cap_set,
	rcop_no_test
};

/*ARGSUSED*/
static rctl_qty_t
zone_lwps_usage(rctl_t *r, proc_t *p)
{
	rctl_qty_t nlwps;
	zone_t *zone = p->p_zone;

	ASSERT(MUTEX_HELD(&p->p_lock));

	mutex_enter(&zone->zone_nlwps_lock);
	nlwps = zone->zone_nlwps;
	mutex_exit(&zone->zone_nlwps_lock);

	return (nlwps);
}

/*ARGSUSED*/
static int
zone_lwps_test(rctl_t *r, proc_t *p, rctl_entity_p_t *e, rctl_val_t *rcntl,
    rctl_qty_t incr, uint_t flags)
{
	rctl_qty_t nlwps;

	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	if (e->rcep_p.zone == NULL)
		return (0);
	ASSERT(MUTEX_HELD(&(e->rcep_p.zone->zone_nlwps_lock)));
	nlwps = e->rcep_p.zone->zone_nlwps;

	if (nlwps + incr > rcntl->rcv_value)
		return (1);

	return (0);
}
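
The comparison at the heart of zone_lwps_test() can be isolated as a pure function. The sketch below is illustrative only: demo_qty_t is a hypothetical stand-in for rctl_qty_t (assumed here to be an unsigned 64-bit quantity), demo_limit_test is a made-up name, and the locking assertions are left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_qty_t;	/* hypothetical stand-in for rctl_qty_t */

/*
 * Returns 1 if adding incr to the current usage would exceed the limit,
 * 0 otherwise. This is the same comparison zone_lwps_test() performs.
 */
static int
demo_limit_test(demo_qty_t usage, demo_qty_t incr, demo_qty_t limit)
{
	if (usage + incr > limit)
		return (1);
	return (0);
}
```

Because the comparison is strict, a request that lands exactly on the limit is allowed; only one that goes past it trips the control.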

/*ARGSUSED*/
static int
zone_lwps_set(rctl_t *rctl, struct proc *p, rctl_entity_p_t *e, rctl_qty_t nv)
{
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	if (e->rcep_p.zone == NULL)
		return (0);
	e->rcep_p.zone->zone_nlwps_ctl = nv;
	return (0);
}

static rctl_ops_t zone_lwps_ops = {
	rcop_no_action,
	zone_lwps_usage,
	zone_lwps_set,
	zone_lwps_test,
};

/*ARGSUSED*/
static int
zone_shmmax_test(rctl_t *r, proc_t *p, rctl_entity_p_t *e, rctl_val_t *rval,
    rctl_qty_t incr, uint_t flags)
{
	rctl_qty_t v;
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	v = e->rcep_p.zone->zone_shmmax + incr;
	if (v > rval->rcv_value)
		return (1);
	return (0);
}

static rctl_ops_t zone_shmmax_ops = {
	rcop_no_action,
	rcop_no_usage,
	rcop_no_set,
	zone_shmmax_test
};

/*ARGSUSED*/
static int
zone_shmmni_test(rctl_t *r, proc_t *p, rctl_entity_p_t *e, rctl_val_t *rval,
    rctl_qty_t incr, uint_t flags)
{
	rctl_qty_t v;
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	v = e->rcep_p.zone->zone_ipc.ipcq_shmmni + incr;
	if (v > rval->rcv_value)
		return (1);
	return (0);
}

static rctl_ops_t zone_shmmni_ops = {
	rcop_no_action,
	rcop_no_usage,
	rcop_no_set,
	zone_shmmni_test
};
212d61b14a8b
PSARC/2006/451 System V resource controls for Zones
ml93401
parents:
2267
diff
changeset
|
1444 |
|
212d61b14a8b
PSARC/2006/451 System V resource controls for Zones
ml93401
parents:
2267
diff
changeset
|
1445 |
/*ARGSUSED*/
static int
zone_semmni_test(rctl_t *r, proc_t *p, rctl_entity_p_t *e, rctl_val_t *rval,
    rctl_qty_t incr, uint_t flags)
{
	rctl_qty_t v;
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	v = e->rcep_p.zone->zone_ipc.ipcq_semmni + incr;
	if (v > rval->rcv_value)
		return (1);
	return (0);
}

static rctl_ops_t zone_semmni_ops = {
	rcop_no_action,
	rcop_no_usage,
	rcop_no_set,
	zone_semmni_test
};

/*ARGSUSED*/
static int
zone_msgmni_test(rctl_t *r, proc_t *p, rctl_entity_p_t *e, rctl_val_t *rval,
    rctl_qty_t incr, uint_t flags)
{
	rctl_qty_t v;
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	v = e->rcep_p.zone->zone_ipc.ipcq_msgmni + incr;
	if (v > rval->rcv_value)
		return (1);
	return (0);
}

static rctl_ops_t zone_msgmni_ops = {
	rcop_no_action,
	rcop_no_usage,
	rcop_no_set,
	zone_msgmni_test
};

/*ARGSUSED*/
static rctl_qty_t
zone_locked_mem_usage(rctl_t *rctl, struct proc *p)
{
	rctl_qty_t q;
	ASSERT(MUTEX_HELD(&p->p_lock));
	mutex_enter(&p->p_zone->zone_mem_lock);
	q = p->p_zone->zone_locked_mem;
	mutex_exit(&p->p_zone->zone_mem_lock);
	return (q);
}

/*ARGSUSED*/
static int
zone_locked_mem_test(rctl_t *r, proc_t *p, rctl_entity_p_t *e,
    rctl_val_t *rcntl, rctl_qty_t incr, uint_t flags)
{
	rctl_qty_t q;
	zone_t *z;

	z = e->rcep_p.zone;
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(MUTEX_HELD(&z->zone_mem_lock));
	q = z->zone_locked_mem;
	if (q + incr > rcntl->rcv_value)
		return (1);
	return (0);
}

/*ARGSUSED*/
static int
zone_locked_mem_set(rctl_t *rctl, struct proc *p, rctl_entity_p_t *e,
    rctl_qty_t nv)
{
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	if (e->rcep_p.zone == NULL)
		return (0);
	e->rcep_p.zone->zone_locked_mem_ctl = nv;
	return (0);
}

static rctl_ops_t zone_locked_mem_ops = {
	rcop_no_action,
	zone_locked_mem_usage,
	zone_locked_mem_set,
	zone_locked_mem_test
};

/*ARGSUSED*/
static rctl_qty_t
zone_max_swap_usage(rctl_t *rctl, struct proc *p)
{
	rctl_qty_t q;
	zone_t *z = p->p_zone;

	ASSERT(MUTEX_HELD(&p->p_lock));
	mutex_enter(&z->zone_mem_lock);
	q = z->zone_max_swap;
	mutex_exit(&z->zone_mem_lock);
	return (q);
}

/*ARGSUSED*/
static int
zone_max_swap_test(rctl_t *r, proc_t *p, rctl_entity_p_t *e,
    rctl_val_t *rcntl, rctl_qty_t incr, uint_t flags)
{
	rctl_qty_t q;
	zone_t *z;

	z = e->rcep_p.zone;
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(MUTEX_HELD(&z->zone_mem_lock));
	q = z->zone_max_swap;
	if (q + incr > rcntl->rcv_value)
		return (1);
	return (0);
}

/*ARGSUSED*/
static int
zone_max_swap_set(rctl_t *rctl, struct proc *p, rctl_entity_p_t *e,
    rctl_qty_t nv)
{
	ASSERT(MUTEX_HELD(&p->p_lock));
	ASSERT(e->rcep_t == RCENTITY_ZONE);
	if (e->rcep_p.zone == NULL)
		return (0);
	e->rcep_p.zone->zone_max_swap_ctl = nv;
	return (0);
}

static rctl_ops_t zone_max_swap_ops = {
	rcop_no_action,
	zone_max_swap_usage,
	zone_max_swap_set,
	zone_max_swap_test
};

/*
 * Helper function to brand the zone with a unique ID.
 */
static void
zone_uniqid(zone_t *zone)
{
	static uint64_t uniqid = 0;

	ASSERT(MUTEX_HELD(&zonehash_lock));
	zone->zone_uniqid = uniqid++;
}

/*
 * Returns a held pointer to the "kcred" for the specified zone.
 */
struct cred *
zone_get_kcred(zoneid_t zoneid)
{
	zone_t *zone;
	cred_t *cr;

	if ((zone = zone_find_by_id(zoneid)) == NULL)
		return (NULL);
	cr = zone->zone_kcred;
	crhold(cr);
	zone_rele(zone);
	return (cr);
}

static int
zone_lockedmem_kstat_update(kstat_t *ksp, int rw)
{
	zone_t *zone = ksp->ks_private;
	zone_kstat_t *zk = ksp->ks_data;

	if (rw == KSTAT_WRITE)
		return (EACCES);

	zk->zk_usage.value.ui64 = zone->zone_locked_mem;
	zk->zk_value.value.ui64 = zone->zone_locked_mem_ctl;
	return (0);
}

static int
zone_swapresv_kstat_update(kstat_t *ksp, int rw)
{
	zone_t *zone = ksp->ks_private;
	zone_kstat_t *zk = ksp->ks_data;

	if (rw == KSTAT_WRITE)
		return (EACCES);

	zk->zk_usage.value.ui64 = zone->zone_max_swap;
	zk->zk_value.value.ui64 = zone->zone_max_swap_ctl;
	return (0);
}

static void
zone_kstat_create(zone_t *zone)
{
	kstat_t *ksp;
	zone_kstat_t *zk;

	ksp = rctl_kstat_create_zone(zone, "lockedmem", KSTAT_TYPE_NAMED,
	    sizeof (zone_kstat_t) / sizeof (kstat_named_t),
	    KSTAT_FLAG_VIRTUAL);

	if (ksp == NULL)
		return;

	zk = ksp->ks_data = kmem_alloc(sizeof (zone_kstat_t), KM_SLEEP);
	ksp->ks_data_size += strlen(zone->zone_name) + 1;
	kstat_named_init(&zk->zk_zonename, "zonename", KSTAT_DATA_STRING);
	kstat_named_setstr(&zk->zk_zonename, zone->zone_name);
	kstat_named_init(&zk->zk_usage, "usage", KSTAT_DATA_UINT64);
	kstat_named_init(&zk->zk_value, "value", KSTAT_DATA_UINT64);
	ksp->ks_update = zone_lockedmem_kstat_update;
	ksp->ks_private = zone;
	kstat_install(ksp);

	zone->zone_lockedmem_kstat = ksp;

	ksp = rctl_kstat_create_zone(zone, "swapresv", KSTAT_TYPE_NAMED,
	    sizeof (zone_kstat_t) / sizeof (kstat_named_t),
	    KSTAT_FLAG_VIRTUAL);

	if (ksp == NULL)
		return;

	zk = ksp->ks_data = kmem_alloc(sizeof (zone_kstat_t), KM_SLEEP);
	ksp->ks_data_size += strlen(zone->zone_name) + 1;
	kstat_named_init(&zk->zk_zonename, "zonename", KSTAT_DATA_STRING);
	kstat_named_setstr(&zk->zk_zonename, zone->zone_name);
	kstat_named_init(&zk->zk_usage, "usage", KSTAT_DATA_UINT64);
	kstat_named_init(&zk->zk_value, "value", KSTAT_DATA_UINT64);
	ksp->ks_update = zone_swapresv_kstat_update;
	ksp->ks_private = zone;
	kstat_install(ksp);

	zone->zone_swapresv_kstat = ksp;
}

static void
zone_kstat_delete(zone_t *zone)
{
	void *data;

	if (zone->zone_lockedmem_kstat != NULL) {
		data = zone->zone_lockedmem_kstat->ks_data;
		kstat_delete(zone->zone_lockedmem_kstat);
		kmem_free(data, sizeof (zone_kstat_t));
	}
	if (zone->zone_swapresv_kstat != NULL) {
		data = zone->zone_swapresv_kstat->ks_data;
		kstat_delete(zone->zone_swapresv_kstat);
		kmem_free(data, sizeof (zone_kstat_t));
	}
}

/*
 * Called very early on in boot to initialize the ZSD list so that
 * zone_key_create() can be called before zone_init().  It also initializes
 * portions of zone0 which may be used before zone_init() is called.  The
 * variable "global_zone" will be set when zone0 is fully initialized by
 * zone_init().
 */
void
zone_zsd_init(void)
{
	mutex_init(&zonehash_lock, NULL, MUTEX_DEFAULT, NULL);
	mutex_init(&zsd_key_lock, NULL, MUTEX_DEFAULT, NULL);
	list_create(&zsd_registered_keys, sizeof (struct zsd_entry),
	    offsetof(struct zsd_entry, zsd_linkage));
	list_create(&zone_active, sizeof (zone_t),
	    offsetof(zone_t, zone_linkage));
	list_create(&zone_deathrow, sizeof (zone_t),
	    offsetof(zone_t, zone_linkage));

	mutex_init(&zone0.zone_lock, NULL, MUTEX_DEFAULT, NULL);
	mutex_init(&zone0.zone_nlwps_lock, NULL, MUTEX_DEFAULT, NULL);
	mutex_init(&zone0.zone_mem_lock, NULL, MUTEX_DEFAULT, NULL);
	zone0.zone_shares = 1;
	zone0.zone_nlwps = 0;
	zone0.zone_nlwps_ctl = INT_MAX;
	zone0.zone_locked_mem = 0;
	zone0.zone_locked_mem_ctl = UINT64_MAX;
	ASSERT(zone0.zone_max_swap == 0);
	zone0.zone_max_swap_ctl = UINT64_MAX;
	zone0.zone_shmmax = 0;
	zone0.zone_ipc.ipcq_shmmni = 0;
	zone0.zone_ipc.ipcq_semmni = 0;
	zone0.zone_ipc.ipcq_msgmni = 0;
	zone0.zone_name = GLOBAL_ZONENAME;
	zone0.zone_nodename = utsname.nodename;
	zone0.zone_domain = srpc_domain;
	zone0.zone_hostid = HW_INVALID_HOSTID;
	zone0.zone_ref = 1;
	zone0.zone_id = GLOBAL_ZONEID;
	zone0.zone_status = ZONE_IS_RUNNING;
	zone0.zone_rootpath = "/";
	zone0.zone_rootpathlen = 2;
	zone0.zone_psetid = ZONE_PS_INVAL;
	zone0.zone_ncpus = 0;
	zone0.zone_ncpus_online = 0;
	zone0.zone_proc_initpid = 1;
	zone0.zone_initname = initname;
	zone0.zone_lockedmem_kstat = NULL;
	zone0.zone_swapresv_kstat = NULL;
	list_create(&zone0.zone_zsd, sizeof (struct zsd_entry),
	    offsetof(struct zsd_entry, zsd_linkage));
	list_insert_head(&zone_active, &zone0);

	/*
	 * The root filesystem is not mounted yet, so zone_rootvp cannot be set
	 * to anything meaningful.  It is assigned to be 'rootdir' in
	 * vfs_mountroot().
	 */
	zone0.zone_rootvp = NULL;
	zone0.zone_vfslist = NULL;
	zone0.zone_bootargs = initargs;
	zone0.zone_privset = kmem_alloc(sizeof (priv_set_t), KM_SLEEP);
	/*
	 * The global zone has all privileges
	 */
	priv_fillset(zone0.zone_privset);
	/*
	 * Add p0 to the global zone
	 */
	zone0.zone_zsched = &p0;
	p0.p_zone = &zone0;
}

/*
 * Compute a hash value based on the contents of the label and the DOI.  The
 * hash algorithm is somewhat arbitrary, but is based on the observation that
 * humans will likely pick labels that differ by amounts that work out to be
 * multiples of the number of hash chains, and thus stirring in some primes
 * should help.
 */
static uint_t
hash_bylabel(void *hdata, mod_hash_key_t key)
{
	const ts_label_t *lab = (ts_label_t *)key;
	const uint32_t *up, *ue;
	uint_t hash;
	int i;

	_NOTE(ARGUNUSED(hdata));

	hash = lab->tsl_doi + (lab->tsl_doi << 1);
	/* we depend on alignment of label, but not representation */
	up = (const uint32_t *)&lab->tsl_label;
	ue = up + sizeof (lab->tsl_label) / sizeof (*up);
	i = 1;
	while (up < ue) {
		/* using 2^n + 1, 1 <= n <= 16 as source of many primes */
		hash += *up + (*up << ((i % 16) + 1));
		up++;
		i++;
	}
	return (hash);
}

/*
 * All that mod_hash cares about here is zero (equal) versus non-zero (not
 * equal).  This may need to be changed if less than / greater than is ever
 * needed.
 */
static int
hash_labelkey_cmp(mod_hash_key_t key1, mod_hash_key_t key2)
{
	ts_label_t *lab1 = (ts_label_t *)key1;
	ts_label_t *lab2 = (ts_label_t *)key2;

	return (label_equal(lab1, lab2) ? 0 : 1);
}

/*
 * Called by main() to initialize the zones framework.
 */
void
zone_init(void)
{
	rctl_dict_entry_t *rde;
	rctl_val_t *dval;
	rctl_set_t *set;
	rctl_alloc_gp_t *gp;
	rctl_entity_p_t e;
	int res;

	ASSERT(curproc == &p0);

	/*
	 * Create ID space for zone IDs.  ID 0 is reserved for the
	 * global zone.
	 */
	zoneid_space = id_space_create("zoneid_space", 1, MAX_ZONEID);

	/*
	 * Initialize generic zone resource controls, if any.
	 */
	rc_zone_cpu_shares = rctl_register("zone.cpu-shares",
	    RCENTITY_ZONE, RCTL_GLOBAL_SIGNAL_NEVER | RCTL_GLOBAL_DENY_NEVER |
	    RCTL_GLOBAL_NOBASIC | RCTL_GLOBAL_COUNT | RCTL_GLOBAL_SYSLOG_NEVER,
	    FSS_MAXSHARES, FSS_MAXSHARES, &zone_cpu_shares_ops);

	rc_zone_cpu_cap = rctl_register("zone.cpu-cap",
	    RCENTITY_ZONE, RCTL_GLOBAL_SIGNAL_NEVER | RCTL_GLOBAL_DENY_ALWAYS |
	    RCTL_GLOBAL_NOBASIC | RCTL_GLOBAL_COUNT | RCTL_GLOBAL_SYSLOG_NEVER |
	    RCTL_GLOBAL_INFINITE,
	    MAXCAP, MAXCAP, &zone_cpu_cap_ops);

	rc_zone_nlwps = rctl_register("zone.max-lwps", RCENTITY_ZONE,
	    RCTL_GLOBAL_NOACTION | RCTL_GLOBAL_NOBASIC | RCTL_GLOBAL_COUNT,
	    INT_MAX, INT_MAX, &zone_lwps_ops);
	/*
	 * System V IPC resource controls
	 */
	rc_zone_msgmni = rctl_register("zone.max-msg-ids",
	    RCENTITY_ZONE, RCTL_GLOBAL_DENY_ALWAYS | RCTL_GLOBAL_NOBASIC |
	    RCTL_GLOBAL_COUNT, IPC_IDS_MAX, IPC_IDS_MAX, &zone_msgmni_ops);

	rc_zone_semmni = rctl_register("zone.max-sem-ids",
	    RCENTITY_ZONE, RCTL_GLOBAL_DENY_ALWAYS | RCTL_GLOBAL_NOBASIC |
	    RCTL_GLOBAL_COUNT, IPC_IDS_MAX, IPC_IDS_MAX, &zone_semmni_ops);

	rc_zone_shmmni = rctl_register("zone.max-shm-ids",
	    RCENTITY_ZONE, RCTL_GLOBAL_DENY_ALWAYS | RCTL_GLOBAL_NOBASIC |
	    RCTL_GLOBAL_COUNT, IPC_IDS_MAX, IPC_IDS_MAX, &zone_shmmni_ops);

	rc_zone_shmmax = rctl_register("zone.max-shm-memory",
	    RCENTITY_ZONE, RCTL_GLOBAL_DENY_ALWAYS | RCTL_GLOBAL_NOBASIC |
	    RCTL_GLOBAL_BYTES, UINT64_MAX, UINT64_MAX, &zone_shmmax_ops);

	/*
	 * Create a rctl_val with PRIVILEGED, NOACTION, value = 1.  Then attach
	 * this at the head of the rctl_dict_entry for ``zone.cpu-shares''.
	 */
	dval = kmem_cache_alloc(rctl_val_cache, KM_SLEEP);
	bzero(dval, sizeof (rctl_val_t));
	dval->rcv_value = 1;
	dval->rcv_privilege = RCPRIV_PRIVILEGED;
	dval->rcv_flagaction = RCTL_LOCAL_NOACTION;
	dval->rcv_action_recip_pid = -1;

	rde = rctl_dict_lookup("zone.cpu-shares");
	(void) rctl_val_list_insert(&rde->rcd_default_value, dval);

	rc_zone_locked_mem = rctl_register("zone.max-locked-memory",
	    RCENTITY_ZONE, RCTL_GLOBAL_NOBASIC | RCTL_GLOBAL_BYTES |
	    RCTL_GLOBAL_DENY_ALWAYS, UINT64_MAX, UINT64_MAX,
	    &zone_locked_mem_ops);

	rc_zone_max_swap = rctl_register("zone.max-swap",
	    RCENTITY_ZONE, RCTL_GLOBAL_NOBASIC | RCTL_GLOBAL_BYTES |
	    RCTL_GLOBAL_DENY_ALWAYS, UINT64_MAX, UINT64_MAX,
	    &zone_max_swap_ops);

	/*
	 * Initialize the ``global zone''.
	 */
	set = rctl_set_create();
	gp = rctl_set_init_prealloc(RCENTITY_ZONE);
	mutex_enter(&p0.p_lock);
	e.rcep_p.zone = &zone0;
	e.rcep_t = RCENTITY_ZONE;
	zone0.zone_rctls = rctl_set_init(RCENTITY_ZONE, &p0, &e, set,
	    gp);

	zone0.zone_nlwps = p0.p_lwpcnt;
	zone0.zone_ntasks = 1;
	mutex_exit(&p0.p_lock);
	zone0.zone_restart_init = B_TRUE;
	zone0.zone_brand = &native_brand;
	rctl_prealloc_destroy(gp);
	/*
	 * pool_default hasn't been initialized yet, so we let pool_init()
	 * take care of making sure the global zone is in the default pool.
	 */

	/*
	 * Initialize global zone kstats
	 */
	zone_kstat_create(&zone0);

	/*
	 * Initialize zone label.
	 * mlp are initialized when tnzonecfg is loaded.
	 */
	zone0.zone_slabel = l_admin_low;
	rw_init(&zone0.zone_mlps.mlpl_rwlock, NULL, RW_DEFAULT, NULL);
	label_hold(l_admin_low);

	/*
	 * Initialise the lock for the database structure used by mntfs.
	 */
	rw_init(&zone0.zone_mntfs_db_lock, NULL, RW_DEFAULT, NULL);

mutex_enter(&zonehash_lock); |
1946 |
zone_uniqid(&zone0); |
|
1947 |
ASSERT(zone0.zone_uniqid == GLOBAL_ZONEUNIQID); |
|
1676 | 1948 |
|
0 | 1949 |
zonehashbyid = mod_hash_create_idhash("zone_by_id", zone_hash_size, |
1950 |
mod_hash_null_valdtor); |
|
1951 |
zonehashbyname = mod_hash_create_strhash("zone_by_name", |
|
1952 |
zone_hash_size, mod_hash_null_valdtor); |
|
1676 | 1953 |
/* |
1954 |
* maintain zonehashbylabel only for labeled systems |
|
1955 |
*/ |
|
1956 |
if (is_system_labeled()) |
|
1957 |
zonehashbylabel = mod_hash_create_extended("zone_by_label", |
|
1958 |
zone_hash_size, mod_hash_null_keydtor, |
|
1959 |
mod_hash_null_valdtor, hash_bylabel, NULL, |
|
1960 |
hash_labelkey_cmp, KM_SLEEP); |
|
0 | 1961 |
zonecount = 1; |
1962 |
||
1963 |
(void) mod_hash_insert(zonehashbyid, (mod_hash_key_t)GLOBAL_ZONEID, |
|
1964 |
(mod_hash_val_t)&zone0); |
|
1965 |
(void) mod_hash_insert(zonehashbyname, (mod_hash_key_t)zone0.zone_name, |
|
1966 |
(mod_hash_val_t)&zone0); |
|
	if (is_system_labeled()) {
		zone0.zone_flags |= ZF_HASHED_LABEL;
		(void) mod_hash_insert(zonehashbylabel,
		    (mod_hash_key_t)zone0.zone_slabel, (mod_hash_val_t)&zone0);
	}
1676 | 1972 |
mutex_exit(&zonehash_lock); |
1973 |
||
0 | 1974 |
/* |
1975 |
* We avoid setting zone_kcred until now, since kcred is initialized |
|
1976 |
* sometime after zone_zsd_init() and before zone_init(). |
|
1977 |
*/ |
|
1978 |
zone0.zone_kcred = kcred; |
|
1979 |
/* |
|
1980 |
* The global zone is fully initialized (except for zone_rootvp which |
|
1981 |
* will be set when the root filesystem is mounted). |
|
1982 |
*/ |
|
1983 |
global_zone = &zone0; |
|
1166 | 1984 |
|
1985 |
/* |
|
1986 |
* Set up an event channel to send zone status change notifications on. |
|
1987 |
*/ |
|
1988 |
res = sysevent_evc_bind(ZONE_EVENT_CHANNEL, &zone_event_chan, |
|
1989 |
EVCH_CREAT); |
|
1990 |
||
1991 |
if (res) |
|
1992 |
panic("Sysevent_evc_bind failed during zone setup.\n"); |
|
3247 | 1993 |
|
0 | 1994 |
} |
1995 |
||
1996 |
static void |
|
1997 |
zone_free(zone_t *zone) |
|
1998 |
{ |
|
1999 |
ASSERT(zone != global_zone); |
|
2000 |
ASSERT(zone->zone_ntasks == 0); |
|
2001 |
ASSERT(zone->zone_nlwps == 0); |
|
2002 |
ASSERT(zone->zone_cred_ref == 0); |
|
2003 |
ASSERT(zone->zone_kcred == NULL); |
|
2004 |
ASSERT(zone_status_get(zone) == ZONE_IS_DEAD || |
|
2005 |
zone_status_get(zone) == ZONE_IS_UNINITIALIZED); |
|
2006 |
||
3792 | 2007 |
/* |
2008 |
* Remove any zone caps. |
|
2009 |
*/ |
|
2010 |
cpucaps_zone_remove(zone); |
|
2011 |
||
2012 |
ASSERT(zone->zone_cpucap == NULL); |
|
2013 |
||
0 | 2014 |
/* remove from deathrow list */ |
2015 |
if (zone_status_get(zone) == ZONE_IS_DEAD) { |
|
2016 |
ASSERT(zone->zone_ref == 0); |
|
2017 |
mutex_enter(&zone_deathrow_lock); |
|
2018 |
list_remove(&zone_deathrow, zone); |
|
2019 |
mutex_exit(&zone_deathrow_lock); |
|
2020 |
} |
|
2021 |
||
2022 |
zone_free_zsd(zone); |
|
789 | 2023 |
zone_free_datasets(zone); |
2024 |
list_destroy(&zone->zone_dl_list); |
0 | 2025 |
|
2026 |
if (zone->zone_rootvp != NULL) |
|
2027 |
VN_RELE(zone->zone_rootvp); |
|
2028 |
if (zone->zone_rootpath) |
|
2029 |
kmem_free(zone->zone_rootpath, zone->zone_rootpathlen); |
|
2030 |
if (zone->zone_name != NULL) |
|
2031 |
kmem_free(zone->zone_name, ZONENAME_MAX); |
|
1676 | 2032 |
if (zone->zone_slabel != NULL) |
2033 |
label_rele(zone->zone_slabel); |
|
0 | 2034 |
if (zone->zone_nodename != NULL) |
2035 |
kmem_free(zone->zone_nodename, _SYS_NMLN); |
|
2036 |
if (zone->zone_domain != NULL) |
|
2037 |
kmem_free(zone->zone_domain, _SYS_NMLN); |
|
2038 |
if (zone->zone_privset != NULL) |
|
2039 |
kmem_free(zone->zone_privset, sizeof (priv_set_t)); |
|
2040 |
if (zone->zone_rctls != NULL) |
|
2041 |
rctl_set_free(zone->zone_rctls); |
|
2042 |
if (zone->zone_bootargs != NULL) |
|
2267 | 2043 |
kmem_free(zone->zone_bootargs, strlen(zone->zone_bootargs) + 1); |
2044 |
if (zone->zone_initname != NULL) |
|
2045 |
kmem_free(zone->zone_initname, strlen(zone->zone_initname) + 1); |
|
	if (zone->zone_pfexecd != NULL)
		klpd_freelist(&zone->zone_pfexecd);
0 | 2048 |
id_free(zoneid_space, zone->zone_id); |
2049 |
mutex_destroy(&zone->zone_lock); |
|
2050 |
cv_destroy(&zone->zone_cv); |
|
1676 | 2051 |
rw_destroy(&zone->zone_mlps.mlpl_rwlock); |
2052 |
rw_destroy(&zone->zone_mntfs_db_lock); |
0 | 2053 |
kmem_free(zone, sizeof (zone_t)); |
2054 |
} |
|
2055 |
||
2056 |
/* |
|
2057 |
* See block comment at the top of this file for information about zone |
|
2058 |
* status values. |
|
2059 |
*/ |
|
2060 |
/* |
|
2061 |
* Convenience function for setting zone status. |
|
2062 |
*/ |
|
2063 |
static void |
|
2064 |
zone_status_set(zone_t *zone, zone_status_t status) |
|
2065 |
{ |
|
1166 | 2066 |
|
2067 |
nvlist_t *nvl = NULL; |
|
0 | 2068 |
ASSERT(MUTEX_HELD(&zone_status_lock)); |
2069 |
ASSERT(status > ZONE_MIN_STATE && status <= ZONE_MAX_STATE && |
|
2070 |
status >= zone_status_get(zone)); |
|
1166 | 2071 |
|
2072 |
if (nvlist_alloc(&nvl, NV_UNIQUE_NAME, KM_SLEEP) || |
|
2073 |
nvlist_add_string(nvl, ZONE_CB_NAME, zone->zone_name) || |
|
2074 |
nvlist_add_string(nvl, ZONE_CB_NEWSTATE, |
|
2267 | 2075 |
zone_status_table[status]) || |
1166 | 2076 |
nvlist_add_string(nvl, ZONE_CB_OLDSTATE, |
2267 | 2077 |
zone_status_table[zone->zone_status]) || |
1166 | 2078 |
nvlist_add_int32(nvl, ZONE_CB_ZONEID, zone->zone_id) || |
2079 |
nvlist_add_uint64(nvl, ZONE_CB_TIMESTAMP, (uint64_t)gethrtime()) || |
|
2080 |
sysevent_evc_publish(zone_event_chan, ZONE_EVENT_STATUS_CLASS, |
|
2267 | 2081 |
ZONE_EVENT_STATUS_SUBCLASS, "sun.com", "kernel", nvl, EVCH_SLEEP)) { |
1166 | 2082 |
#ifdef DEBUG |
2083 |
(void) printf( |
|
2084 |
"Failed to allocate and send zone state change event.\n"); |
|
2085 |
#endif |
|
2086 |
} |
|
2087 |
nvlist_free(nvl); |
|
2088 |
||
0 | 2089 |
zone->zone_status = status; |
1166 | 2090 |
|
0 | 2091 |
cv_broadcast(&zone->zone_cv); |
2092 |
} |
|
2093 |
||
2094 |
/* |
|
2095 |
* Public function to retrieve the zone status. The zone status may |
|
2096 |
* change after it is retrieved. |
|
2097 |
*/ |
|
2098 |
zone_status_t |
|
2099 |
zone_status_get(zone_t *zone) |
|
2100 |
{ |
|
2101 |
return (zone->zone_status); |
|
2102 |
} |
|
2103 |
||
2104 |
static int |
|
2105 |
zone_set_bootargs(zone_t *zone, const char *zone_bootargs) |
|
2106 |
{ |
|
2267 | 2107 |
char *bootargs = kmem_zalloc(BOOTARGS_MAX, KM_SLEEP); |
2108 |
int err = 0; |
|
2109 |
||
2110 |
ASSERT(zone != global_zone); |
|
2111 |
if ((err = copyinstr(zone_bootargs, bootargs, BOOTARGS_MAX, NULL)) != 0) |
|
2112 |
goto done; /* EFAULT or ENAMETOOLONG */ |
|
2113 |
||
2114 |
if (zone->zone_bootargs != NULL) |
|
2115 |
kmem_free(zone->zone_bootargs, strlen(zone->zone_bootargs) + 1); |
|
2116 |
||
2117 |
zone->zone_bootargs = kmem_alloc(strlen(bootargs) + 1, KM_SLEEP); |
|
2118 |
(void) strcpy(zone->zone_bootargs, bootargs); |
|
2119 |
||
2120 |
done: |
|
2121 |
kmem_free(bootargs, BOOTARGS_MAX); |
|
2122 |
return (err); |
|
2123 |
} |
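The pattern in zone_set_bootargs() (copy into a bounded scratch buffer, then store a right-sized private copy and release the old one) can be sketched in user space as follows. This is only an illustration under stated assumptions: `set_string_attr` and `ATTR_MAX` are hypothetical names, `malloc`/`free` stand in for `kmem_alloc`/`kmem_free`, and a plain bounded scan stands in for `copyinstr()`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define	ATTR_MAX	64	/* hypothetical stand-in for BOOTARGS_MAX */

/*
 * Replace *slot with a right-sized private copy of src.  Returns 0 on
 * success, ENAMETOOLONG if src plus its terminator does not fit in
 * ATTR_MAX bytes, or ENOMEM if the allocation fails.  On any error
 * *slot is left untouched.
 */
int
set_string_attr(char **slot, const char *src)
{
	size_t len = 0;
	char *copy;

	/* Bounded scan: never look at more than ATTR_MAX bytes of src. */
	while (len < ATTR_MAX && src[len] != '\0')
		len++;
	if (len == ATTR_MAX)
		return (ENAMETOOLONG);

	if ((copy = malloc(len + 1)) == NULL)
		return (ENOMEM);
	(void) memcpy(copy, src, len + 1);	/* includes the terminator */

	free(*slot);		/* free(NULL) is a no-op */
	*slot = copy;
	return (0);
}
```

Unlike the kernel routine, the sketch has to handle allocation failure, because `malloc` can return NULL whereas a `KM_SLEEP` allocation blocks until memory is available.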
|
2124 |
||
static int
zone_set_brand(zone_t *zone, const char *brand)
{
	struct brand_attr *attrp;
	brand_t *bp;

	attrp = kmem_alloc(sizeof (struct brand_attr), KM_SLEEP);
	if (copyin(brand, attrp, sizeof (struct brand_attr)) != 0) {
		kmem_free(attrp, sizeof (struct brand_attr));
		return (EFAULT);
	}

	bp = brand_register_zone(attrp);
	kmem_free(attrp, sizeof (struct brand_attr));
	if (bp == NULL)
		return (EINVAL);

	/*
	 * This is the only place where a zone can change its brand.
	 * We already need to hold zone_status_lock to check the zone
	 * status, so we'll just use that lock to serialize zone
	 * branding requests as well.
	 */
	mutex_enter(&zone_status_lock);

	/* Re-Branding is not allowed and the zone can't be booted yet */
	if ((ZONE_IS_BRANDED(zone)) ||
	    (zone_status_get(zone) >= ZONE_IS_BOOTING)) {
		mutex_exit(&zone_status_lock);
		brand_unregister_zone(bp);
		return (EINVAL);
	}

	/* set up the brand specific data */
	zone->zone_brand = bp;
	ZBROP(zone)->b_init_brand_data(zone);

	mutex_exit(&zone_status_lock);
	return (0);
}

2166 |
static int |
2267 | 2167 |
zone_set_initname(zone_t *zone, const char *zone_initname) |
2168 |
{ |
|
2169 |
char initname[INITNAME_SZ]; |
|
0 | 2170 |
size_t len; |
2267 | 2171 |
int err = 0; |
2172 |
||
2173 |
ASSERT(zone != global_zone); |
|
2174 |
if ((err = copyinstr(zone_initname, initname, INITNAME_SZ, &len)) != 0) |
|
0 | 2175 |
return (err); /* EFAULT or ENAMETOOLONG */ |
2267 | 2176 |
|
2177 |
if (zone->zone_initname != NULL) |
|
2178 |
kmem_free(zone->zone_initname, strlen(zone->zone_initname) + 1); |
|
2179 |
||
2180 |
zone->zone_initname = kmem_alloc(strlen(initname) + 1, KM_SLEEP); |
|
2181 |
(void) strcpy(zone->zone_initname, initname); |
|
0 | 2182 |
return (0); |
2183 |
} |
|
2184 |
||
3247 | 2185 |
static int |
2186 |
zone_set_phys_mcap(zone_t *zone, const uint64_t *zone_mcap) |
|
2187 |
{ |
|
2188 |
uint64_t mcap; |
|
2189 |
int err = 0; |
|
2190 |
||
2191 |
if ((err = copyin(zone_mcap, &mcap, sizeof (uint64_t))) == 0) |
|
2192 |
zone->zone_phys_mcap = mcap; |
|
2193 |
||
2194 |
return (err); |
|
2195 |
} |
|
2196 |
||
2197 |
static int |
|
2198 |
zone_set_sched_class(zone_t *zone, const char *new_class) |
|
2199 |
{ |
|
2200 |
char sched_class[PC_CLNMSZ]; |
|
2201 |
id_t classid; |
|
2202 |
int err; |
|
2203 |
||
2204 |
ASSERT(zone != global_zone); |
|
2205 |
if ((err = copyinstr(new_class, sched_class, PC_CLNMSZ, NULL)) != 0) |
|
2206 |
return (err); /* EFAULT or ENAMETOOLONG */ |
|
2207 |
||
2208 |
if (getcid(sched_class, &classid) != 0 || CLASS_KERNEL(classid)) |
3247 | 2209 |
return (set_errno(EINVAL)); |
2210 |
zone->zone_defaultcid = classid; |
|
2211 |
ASSERT(zone->zone_defaultcid > 0 && |
|
2212 |
zone->zone_defaultcid < loaded_classes); |
|
2213 |
||
2214 |
return (0); |
|
2215 |
} |
|
2216 |
||
0 | 2217 |
/* |
2218 |
* Block indefinitely waiting for (zone_status >= status) |
|
2219 |
*/ |
|
2220 |
void |
|
2221 |
zone_status_wait(zone_t *zone, zone_status_t status) |
|
2222 |
{ |
|
2223 |
ASSERT(status > ZONE_MIN_STATE && status <= ZONE_MAX_STATE); |
|
2224 |
||
2225 |
mutex_enter(&zone_status_lock); |
|
2226 |
while (zone->zone_status < status) { |
|
2227 |
cv_wait(&zone->zone_cv, &zone_status_lock); |
|
2228 |
} |
|
2229 |
mutex_exit(&zone_status_lock); |
|
2230 |
} |
|
2231 |
||
2232 |
/* |
|
2233 |
* Private CPR-safe version of zone_status_wait(). |
|
2234 |
*/ |
|
2235 |
static void |
|
2236 |
zone_status_wait_cpr(zone_t *zone, zone_status_t status, char *str) |
|
2237 |
{ |
|
2238 |
callb_cpr_t cprinfo; |
|
2239 |
||
2240 |
ASSERT(status > ZONE_MIN_STATE && status <= ZONE_MAX_STATE); |
|
2241 |
||
2242 |
CALLB_CPR_INIT(&cprinfo, &zone_status_lock, callb_generic_cpr, |
|
2243 |
str); |
|
2244 |
mutex_enter(&zone_status_lock); |
|
2245 |
while (zone->zone_status < status) { |
|
2246 |
CALLB_CPR_SAFE_BEGIN(&cprinfo); |
|
2247 |
cv_wait(&zone->zone_cv, &zone_status_lock); |
|
2248 |
CALLB_CPR_SAFE_END(&cprinfo, &zone_status_lock); |
|
2249 |
} |
|
2250 |
/* |
|
2251 |
* zone_status_lock is implicitly released by the following. |
|
2252 |
*/ |
|
2253 |
CALLB_CPR_EXIT(&cprinfo); |
|
2254 |
} |
|
2255 |
||
2256 |
/* |
|
2257 |
* Block until zone enters requested state or signal is received. Return (0) |
|
2258 |
* if signaled, non-zero otherwise. |
|
2259 |
*/ |
|
2260 |
int |
|
2261 |
zone_status_wait_sig(zone_t *zone, zone_status_t status) |
|
2262 |
{ |
|
2263 |
ASSERT(status > ZONE_MIN_STATE && status <= ZONE_MAX_STATE); |
|
2264 |
||
2265 |
mutex_enter(&zone_status_lock); |
|
2266 |
while (zone->zone_status < status) { |
|
2267 |
if (!cv_wait_sig(&zone->zone_cv, &zone_status_lock)) { |
|
2268 |
mutex_exit(&zone_status_lock); |
|
2269 |
return (0); |
|
2270 |
} |
|
2271 |
} |
|
2272 |
mutex_exit(&zone_status_lock); |
|
2273 |
return (1); |
|
2274 |
} |
|
2275 |
||
2276 |
/* |
|
2277 |
* Block until the zone enters the requested state or the timeout expires, |
|
2278 |
* whichever happens first. Return (-1) if operation timed out, time remaining |
|
2279 |
* otherwise. |
|
2280 |
*/ |
|
2281 |
clock_t |
|
2282 |
zone_status_timedwait(zone_t *zone, clock_t tim, zone_status_t status) |
|
2283 |
{ |
|
2284 |
clock_t timeleft = 0; |
|
2285 |
||
2286 |
ASSERT(status > ZONE_MIN_STATE && status <= ZONE_MAX_STATE); |
|
2287 |
||
2288 |
mutex_enter(&zone_status_lock); |
|
2289 |
while (zone->zone_status < status && timeleft != -1) { |
|
2290 |
timeleft = cv_timedwait(&zone->zone_cv, &zone_status_lock, tim); |
|
2291 |
} |
|
2292 |
mutex_exit(&zone_status_lock); |
|
2293 |
return (timeleft); |
|
2294 |
} |
|
2295 |
||
2296 |
/* |
|
2297 |
* Block until the zone enters the requested state, the current process is |
|
2298 |
* signaled, or the timeout expires, whichever happens first. Return (-1) if |
|
2299 |
* operation timed out, 0 if signaled, time remaining otherwise. |
|
2300 |
*/ |
|
2301 |
clock_t |
|
2302 |
zone_status_timedwait_sig(zone_t *zone, clock_t tim, zone_status_t status) |
|
2303 |
{ |
|
2304 |
clock_t timeleft = tim - ddi_get_lbolt(); |
0 | 2305 |
|
2306 |
ASSERT(status > ZONE_MIN_STATE && status <= ZONE_MAX_STATE); |
|
2307 |
||
2308 |
mutex_enter(&zone_status_lock); |
|
2309 |
while (zone->zone_status < status) { |
|
2310 |
timeleft = cv_timedwait_sig(&zone->zone_cv, &zone_status_lock, |
|
2311 |
tim); |
|
2312 |
if (timeleft <= 0) |
|
2313 |
break; |
|
2314 |
} |
|
2315 |
mutex_exit(&zone_status_lock); |
|
2316 |
return (timeleft); |
|
2317 |
} |
|
2318 |
||
2319 |
/* |
|
2320 |
* Zones have two reference counts: one for references from credential |
|
2321 |
* structures (zone_cred_ref), and one (zone_ref) for everything else. |
|
2322 |
* This is so we can allow a zone to be rebooted while there are still |
|
2323 |
* outstanding cred references, since certain drivers cache dblks (which |
|
2324 |
* implicitly results in cached creds). We wait for zone_ref to drop to |
|
2325 |
* 0 (actually 1), but not zone_cred_ref. The zone structure itself is |
|
2326 |
* later freed when the zone_cred_ref drops to 0, though nothing other |
|
2327 |
* than the zone id and privilege set should be accessed once the zone |
|
2328 |
* is "dead". |
|
2329 |
* |
|
2330 |
* A debugging flag, zone_wait_for_cred, can be set to a non-zero value |
|
2331 |
* to force halt/reboot to block waiting for the zone_cred_ref to drop |
|
2332 |
* to 0. This can be useful to flush out other sources of cached creds |
|
2333 |
* that may be less innocuous than the driver case. |
|
2334 |
*/ |
|
2335 |
||
2336 |
int zone_wait_for_cred = 0; |
|
2337 |
||
2338 |
static void |
|
2339 |
zone_hold_locked(zone_t *z) |
|
2340 |
{ |
|
2341 |
ASSERT(MUTEX_HELD(&z->zone_lock)); |
|
2342 |
z->zone_ref++; |
|
2343 |
ASSERT(z->zone_ref != 0); |
|
2344 |
} |
|
2345 |
||
2346 |
void |
|
2347 |
zone_hold(zone_t *z) |
|
2348 |
{ |
|
2349 |
mutex_enter(&z->zone_lock); |
|
2350 |
zone_hold_locked(z); |
|
2351 |
mutex_exit(&z->zone_lock); |
|
2352 |
} |
|
2353 |
||
2354 |
/* |
|
2355 |
* If the non-cred ref count drops to 1 and either the cred ref count |
|
2356 |
* is 0 or we aren't waiting for cred references, the zone is ready to |
|
2357 |
* be destroyed. |
|
2358 |
*/ |
|
2359 |
#define ZONE_IS_UNREF(zone) ((zone)->zone_ref == 1 && \ |
|
2360 |
(!zone_wait_for_cred || (zone)->zone_cred_ref == 0)) |
|
2361 |
||
2362 |
void |
|
2363 |
zone_rele(zone_t *z) |
|
2364 |
{ |
|
2365 |
boolean_t wakeup; |
|
2366 |
||
2367 |
mutex_enter(&z->zone_lock); |
|
2368 |
ASSERT(z->zone_ref != 0); |
|
2369 |
z->zone_ref--; |
|
2370 |
if (z->zone_ref == 0 && z->zone_cred_ref == 0) { |
|
2371 |
/* no more refs, free the structure */ |
|
2372 |
mutex_exit(&z->zone_lock); |
|
2373 |
zone_free(z); |
|
2374 |
return; |
|
2375 |
} |
|
2376 |
/* signal zone_destroy so the zone can finish halting */ |
|
2377 |
wakeup = (ZONE_IS_UNREF(z) && zone_status_get(z) >= ZONE_IS_DEAD); |
|
2378 |
mutex_exit(&z->zone_lock); |
|
2379 |
||
2380 |
if (wakeup) { |
|
2381 |
/* |
|
2382 |
* Grabbing zonehash_lock here effectively synchronizes with |
|
2383 |
* zone_destroy() to avoid missed signals. |
|
2384 |
*/ |
|
2385 |
mutex_enter(&zonehash_lock); |
|
2386 |
cv_broadcast(&zone_destroy_cv); |
|
2387 |
mutex_exit(&zonehash_lock); |
|
2388 |
} |
|
2389 |
} |
|
2390 |
||
2391 |
void |
|
2392 |
zone_cred_hold(zone_t *z) |
|
2393 |
{ |
|
2394 |
mutex_enter(&z->zone_lock); |
|
2395 |
z->zone_cred_ref++; |
|
2396 |
ASSERT(z->zone_cred_ref != 0); |
|
2397 |
mutex_exit(&z->zone_lock); |
|
2398 |
} |
|
2399 |
||
2400 |
void |
|
2401 |
zone_cred_rele(zone_t *z) |
|
2402 |
{ |
|
2403 |
boolean_t wakeup; |
|
2404 |
||
2405 |
mutex_enter(&z->zone_lock); |
|
2406 |
ASSERT(z->zone_cred_ref != 0); |
|
2407 |
z->zone_cred_ref--; |
|
2408 |
if (z->zone_ref == 0 && z->zone_cred_ref == 0) { |
|
2409 |
/* no more refs, free the structure */ |
|
2410 |
mutex_exit(&z->zone_lock); |
|
2411 |
zone_free(z); |
|
2412 |
return; |
|
2413 |
} |
|
2414 |
/* |
|
2415 |
* If zone_destroy is waiting for the cred references to drain |
|
2416 |
* out, and they have, signal it. |
|
2417 |
*/ |
|
2418 |
wakeup = (zone_wait_for_cred && ZONE_IS_UNREF(z) && |
|
2419 |
zone_status_get(z) >= ZONE_IS_DEAD); |
|
2420 |
mutex_exit(&z->zone_lock); |
|
2421 |
||
2422 |
if (wakeup) { |
|
2423 |
/* |
|
2424 |
* Grabbing zonehash_lock here effectively synchronizes with |
|
2425 |
* zone_destroy() to avoid missed signals. |
|
2426 |
*/ |
|
2427 |
mutex_enter(&zonehash_lock); |
|
2428 |
cv_broadcast(&zone_destroy_cv); |
|
2429 |
mutex_exit(&zonehash_lock); |
|
2430 |
} |
|
2431 |
} |
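The two-counter scheme described above (zone_ref for general holds, zone_cred_ref for credential holds, with the structure freed only when both reach zero) can be modelled in a few lines. This is a single-threaded sketch, not the kernel code: `obj_t` and the `obj_*` functions are hypothetical names, locking and condition-variable wakeups are omitted, and a `freed` flag stands in for the call to zone_free().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for zone_t; only the two counters matter here. */
typedef struct obj {
	unsigned ref;		/* general references, like zone_ref */
	unsigned cred_ref;	/* credential references, like zone_cred_ref */
	bool freed;		/* set where the kernel would call zone_free() */
} obj_t;

void
obj_hold(obj_t *o)
{
	o->ref++;
	assert(o->ref != 0);		/* catch counter wrap-around */
}

void
obj_cred_hold(obj_t *o)
{
	o->cred_ref++;
	assert(o->cred_ref != 0);
}

/* The object goes away only when BOTH counters have drained. */
static void
obj_free_if_unused(obj_t *o)
{
	if (o->ref == 0 && o->cred_ref == 0)
		o->freed = true;
}

void
obj_rele(obj_t *o)
{
	assert(o->ref != 0);
	o->ref--;
	obj_free_if_unused(o);
}

void
obj_cred_rele(obj_t *o)
{
	assert(o->cred_ref != 0);
	o->cred_ref--;
	obj_free_if_unused(o);
}

/*
 * Same test as ZONE_IS_UNREF: one general reference left (the
 * destroyer's own), and credential references matter only when the
 * caller has chosen to wait for them.
 */
bool
obj_is_unref(const obj_t *o, bool wait_for_cred)
{
	return (o->ref == 1 && (!wait_for_cred || o->cred_ref == 0));
}
```

With one general hold and one credential hold outstanding, `obj_is_unref(o, false)` is true while `obj_is_unref(o, true)` is false, which is the difference the zone_wait_for_cred debugging flag makes to halt and reboot.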
|
2432 |
||
2433 |
void |
|
2434 |
zone_task_hold(zone_t *z) |
|
2435 |
{ |
|
2436 |
mutex_enter(&z->zone_lock); |
|
2437 |
z->zone_ntasks++; |
|
2438 |
ASSERT(z->zone_ntasks != 0); |
|
2439 |
mutex_exit(&z->zone_lock); |
|
2440 |
} |
|
2441 |
||
2442 |
void |
|
2443 |
zone_task_rele(zone_t *zone) |
|
2444 |
{ |
|
2445 |
uint_t refcnt; |
|
2446 |
||
2447 |
mutex_enter(&zone->zone_lock); |
|
2448 |
ASSERT(zone->zone_ntasks != 0); |
|
2449 |
refcnt = --zone->zone_ntasks; |
|
2450 |
if (refcnt > 1) { /* Common case */ |
|
2451 |
mutex_exit(&zone->zone_lock); |
|
2452 |
return; |
|
2453 |
} |
|
2454 |
zone_hold_locked(zone); /* so we can use the zone_t later */ |
|
2455 |
mutex_exit(&zone->zone_lock); |
|
2456 |
if (refcnt == 1) { |
|
2457 |
/* |
|
2458 |
* See if the zone is shutting down. |
|
2459 |
*/ |
|
2460 |
mutex_enter(&zone_status_lock); |
|
2461 |
if (zone_status_get(zone) != ZONE_IS_SHUTTING_DOWN) { |
|
2462 |
goto out; |
|
2463 |
} |
|
2464 |
||
2465 |
/* |
|
2466 |
* Make sure the ntasks didn't change since we |
|
2467 |
* dropped zone_lock. |
|
2468 |
*/ |
|
2469 |
mutex_enter(&zone->zone_lock); |
|
2470 |
if (refcnt != zone->zone_ntasks) { |
|
2471 |
mutex_exit(&zone->zone_lock); |
|
2472 |
goto out; |
|
2473 |
} |
|
2474 |
mutex_exit(&zone->zone_lock); |
|
2475 |
||
2476 |
/* |
|
2477 |
* No more user processes in the zone. The zone is empty. |
|
2478 |
*/ |
|
2479 |
zone_status_set(zone, ZONE_IS_EMPTY); |
|
2480 |
goto out; |
|
2481 |
} |
|
2482 |
||
2483 |
ASSERT(refcnt == 0); |
|
2484 |
/* |
|
2485 |
* zsched has exited; the zone is dead. |
|
2486 |
*/ |
|
2487 |
zone->zone_zsched = NULL; /* paranoia */ |
|
2488 |
mutex_enter(&zone_status_lock); |
|
2489 |
zone_status_set(zone, ZONE_IS_DEAD); |
|
2490 |
out: |
|
2491 |
mutex_exit(&zone_status_lock); |
|
2492 |
zone_rele(zone); |
|
2493 |
} |
|
2494 |
||
2495 |
zoneid_t |
|
2496 |
getzoneid(void) |
|
2497 |
{ |
|
2498 |
return (curproc->p_zone->zone_id); |
|
2499 |
} |
|
2500 |
||
2501 |
/* |
|
2502 |
* Internal versions of zone_find_by_*(). These don't zone_hold() or |
|
2503 |
* check the validity of a zone's state. |
|
2504 |
*/ |
|
2505 |
static zone_t * |
|
2506 |
zone_find_all_by_id(zoneid_t zoneid) |
|
2507 |
{ |
|
2508 |
mod_hash_val_t hv; |
|
2509 |
zone_t *zone = NULL; |
|
2510 |
||
2511 |
ASSERT(MUTEX_HELD(&zonehash_lock)); |
|
2512 |
||
2513 |
if (mod_hash_find(zonehashbyid, |
|
2514 |
(mod_hash_key_t)(uintptr_t)zoneid, &hv) == 0) |
|
2515 |
zone = (zone_t *)hv; |
|
2516 |
return (zone); |
|
2517 |
} |
|
2518 |
||
2519 |
static zone_t * |
|
1676 | 2520 |
zone_find_all_by_label(const ts_label_t *label) |
2521 |
{ |
|
2522 |
mod_hash_val_t hv; |
|
2523 |
zone_t *zone = NULL; |
|
2524 |
||
2525 |
ASSERT(MUTEX_HELD(&zonehash_lock)); |
|
2526 |
||
2527 |
/* |
|
2528 |
* zonehashbylabel is not maintained for unlabeled systems |
|
2529 |
*/ |
|
2530 |
if (!is_system_labeled()) |
|
2531 |
return (NULL); |
|
2532 |
if (mod_hash_find(zonehashbylabel, (mod_hash_key_t)label, &hv) == 0) |
|
2533 |
zone = (zone_t *)hv; |
|
2534 |
return (zone); |
|
2535 |
} |
|
2536 |
||
2537 |
static zone_t * |
|
0 | 2538 |
zone_find_all_by_name(char *name) |
2539 |
{ |
|
2540 |
mod_hash_val_t hv; |
|
2541 |
zone_t *zone = NULL; |
|
2542 |
||
2543 |
ASSERT(MUTEX_HELD(&zonehash_lock)); |
|
2544 |
||
2545 |
if (mod_hash_find(zonehashbyname, (mod_hash_key_t)name, &hv) == 0) |
|
2546 |
zone = (zone_t *)hv; |
|
2547 |
return (zone); |
|
2548 |
} |
|
2549 |
||
2550 |
/* |
|
2551 |
* Public interface for looking up a zone by zoneid. Only returns the zone if |
|
2552 |
* it is fully initialized, and has not yet begun the zone_destroy() sequence. |
|
2553 |
* Caller must call zone_rele() once it is done with the zone. |
|
2554 |
* |
|
2555 |
* The zone may begin the zone_destroy() sequence immediately after this |
|
2556 |
* function returns, but may be safely used until zone_rele() is called. |
|
2557 |
*/ |
|
2558 |
zone_t * |
|
2559 |
zone_find_by_id(zoneid_t zoneid) |
|
2560 |
{ |
|
2561 |
zone_t *zone; |
|
2562 |
zone_status_t status; |
|
2563 |
||
2564 |
mutex_enter(&zonehash_lock); |
|
2565 |
if ((zone = zone_find_all_by_id(zoneid)) == NULL) { |
|
2566 |
mutex_exit(&zonehash_lock); |
|
2567 |
return (NULL); |
|
2568 |
} |
|
2569 |
status = zone_status_get(zone); |
|
2570 |
if (status < ZONE_IS_READY || status > ZONE_IS_DOWN) { |
|
2571 |
/* |
|
2572 |
* For all practical purposes the zone doesn't exist. |
|
2573 |
*/ |
|
2574 |
mutex_exit(&zonehash_lock); |
|
2575 |
return (NULL); |
|
2576 |
} |
|
2577 |
zone_hold(zone); |
|
2578 |
mutex_exit(&zonehash_lock); |
|
2579 |
return (zone); |
|
2580 |
} |
|
2581 |
||
/*
 * Similar to zone_find_by_id, but using zone label as the key.
 */
zone_t *
zone_find_by_label(const ts_label_t *label)
{
	zone_t *zone;
	zone_status_t status;

	mutex_enter(&zonehash_lock);
	if ((zone = zone_find_all_by_label(label)) == NULL) {
		mutex_exit(&zonehash_lock);
		return (NULL);
	}

	status = zone_status_get(zone);
	if (status > ZONE_IS_DOWN) {
		/*
		 * For all practical purposes the zone doesn't exist.
		 */
		mutex_exit(&zonehash_lock);
		return (NULL);
	}
	zone_hold(zone);
	mutex_exit(&zonehash_lock);
	return (zone);
}
|
2609 |
||
2610 |
/* |
|
0 | 2611 |
* Similar to zone_find_by_id, but using zone name as the key. |
2612 |
*/ |
|
2613 |
zone_t * |
|
2614 |
zone_find_by_name(char *name) |
|
2615 |
{ |
|
2616 |
zone_t *zone; |
|
2617 |
zone_status_t status; |
|
2618 |
||
2619 |
mutex_enter(&zonehash_lock); |
|
2620 |
if ((zone = zone_find_all_by_name(name)) == NULL) { |
|
2621 |
mutex_exit(&zonehash_lock); |
|
2622 |
return (NULL); |
|
2623 |
} |
|
2624 |
status = zone_status_get(zone); |
|
2625 |
if (status < ZONE_IS_READY || status > ZONE_IS_DOWN) { |
|
2626 |
/* |
|
2627 |
* For all practical purposes the zone doesn't exist. |
|
2628 |
*/ |
|
2629 |
mutex_exit(&zonehash_lock); |
|
2630 |
return (NULL); |
|
2631 |
} |
|
2632 |
zone_hold(zone); |
|
2633 |
mutex_exit(&zonehash_lock); |
|
2634 |
return (zone); |
|
2635 |
} |
|
2636 |
||
/*
 * Similar to zone_find_by_id(), using the path as a key. For instance,
 * if there is a zone "foo" rooted at /foo/root, and the path argument
 * is "/foo/root/proc", it will return the held zone_t corresponding to
 * zone "foo".
 *
 * zone_find_by_path() always returns a non-NULL value, since at the
 * very least every path will be contained in the global zone.
 *
 * As with the other zone_find_by_*() functions, the caller is
 * responsible for zone_rele()ing the return value of this function.
 */
zone_t *
zone_find_by_path(const char *path)
{
	zone_t *zone;
	zone_t *zret = NULL;
	zone_status_t status;

	if (path == NULL) {
		/*
		 * Call from rootconf().
		 */
		zone_hold(global_zone);
		return (global_zone);
	}
	ASSERT(*path == '/');
	mutex_enter(&zonehash_lock);
	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		if (ZONE_PATH_VISIBLE(path, zone))
			zret = zone;
	}
	ASSERT(zret != NULL);
	status = zone_status_get(zret);
	if (status < ZONE_IS_READY || status > ZONE_IS_DOWN) {
		/*
		 * Zone practically doesn't exist.
		 */
		zret = global_zone;
	}
	zone_hold(zret);
	mutex_exit(&zonehash_lock);
	return (zret);
}

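/*
 * Illustration only, not part of the kernel source: a userland sketch of
 * the "last visible zone wins" scan in zone_find_by_path() above. It
 * assumes that ZONE_PATH_VISIBLE() amounts to a prefix comparison against
 * a root path stored with a trailing '/', and that the global zone (root
 * "/") comes first in the list; neither detail is shown in this part of
 * the file. struct path_zone, path_visible() and path_owner() are
 * hypothetical names.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct path_zone {
	const char *name;
	const char *rootpath;	/* assumed to always end in '/' */
};

/* Hypothetical prefix test standing in for ZONE_PATH_VISIBLE(). */
static int
path_visible(const char *path, const struct path_zone *z)
{
	/* Ignore the trailing '/', so the root directory itself matches. */
	return (strncmp(path, z->rootpath, strlen(z->rootpath) - 1) == 0);
}

/* Return the last zone in list order whose root is a prefix of path. */
const struct path_zone *
path_owner(const char *path, const struct path_zone *zones, size_t nzones)
{
	const struct path_zone *ret = NULL;
	size_t i;

	assert(path[0] == '/');
	for (i = 0; i < nzones; i++) {
		if (path_visible(path, &zones[i]))
			ret = &zones[i];
	}
	return (ret);
}
```

/*
 * Because the global entry has root "/", the comparison length is zero
 * and it matches every absolute path, which is why the result is never
 * NULL when the global zone is in the list.
 */
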
/*
 * Get the number of cpus visible to this zone. The system-wide global
 * 'ncpus' is returned if pools are disabled, the caller is in the
 * global zone, or a NULL zone argument is passed in.
 */
int
zone_ncpus_get(zone_t *zone)
{
	int myncpus = zone == NULL ? 0 : zone->zone_ncpus;

	return (myncpus != 0 ? myncpus : ncpus);
}

/*
 * Get the number of online cpus visible to this zone. The system-wide
 * global 'ncpus_online' is returned if pools are disabled, the caller
 * is in the global zone, or a NULL zone argument is passed in.
 */
int
zone_ncpus_online_get(zone_t *zone)
{
	int myncpus_online = zone == NULL ? 0 : zone->zone_ncpus_online;

	return (myncpus_online != 0 ? myncpus_online : ncpus_online);
}

/*
 * Return the pool to which the zone is currently bound.
 */
pool_t *
zone_pool_get(zone_t *zone)
{
	ASSERT(pool_lock_held());

	return (zone->zone_pool);
}

/*
 * Set the zone's pool pointer and update the zone's visibility to match
 * the resources in the new pool.
 */
void
zone_pool_set(zone_t *zone, pool_t *pool)
{
	ASSERT(pool_lock_held());
	ASSERT(MUTEX_HELD(&cpu_lock));

	zone->zone_pool = pool;
	zone_pset_set(zone, pool->pool_pset->pset_id);
}

/*
 * Return the cached value of the id of the processor set to which the
 * zone is currently bound. The value will be ZONE_PS_INVAL if the pools
 * facility is disabled.
 */
psetid_t
zone_pset_get(zone_t *zone)
{
	ASSERT(MUTEX_HELD(&cpu_lock));

	return (zone->zone_psetid);
}

/*
 * Set the cached value of the id of the processor set to which the zone
 * is currently bound. Also update the zone's visibility to match the
 * resources in the new processor set.
 */
void
zone_pset_set(zone_t *zone, psetid_t newpsetid)
{
	psetid_t oldpsetid;

	ASSERT(MUTEX_HELD(&cpu_lock));
	oldpsetid = zone_pset_get(zone);

	if (oldpsetid == newpsetid)
		return;
	/*
	 * Global zone sees all.
	 */
	if (zone != global_zone) {
		zone->zone_psetid = newpsetid;
		if (newpsetid != ZONE_PS_INVAL)
			pool_pset_visibility_add(newpsetid, zone);
		if (oldpsetid != ZONE_PS_INVAL)
			pool_pset_visibility_remove(oldpsetid, zone);
	}
	/*
	 * Disabling pools, so we should start using the global values
	 * for ncpus and ncpus_online.
	 */
	if (newpsetid == ZONE_PS_INVAL) {
		zone->zone_ncpus = 0;
		zone->zone_ncpus_online = 0;
	}
}

/*
 * Walk the list of active zones and issue the provided callback for
 * each of them.
 *
 * Caller must not be holding any locks that may be acquired under
 * zonehash_lock. See comment at the beginning of the file for a list of
 * common locks and their interactions with zones.
 */
int
zone_walk(int (*cb)(zone_t *, void *), void *data)
{
	zone_t *zone;
	int ret = 0;
	zone_status_t status;

	mutex_enter(&zonehash_lock);
	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		/*
		 * Skip zones that shouldn't be externally visible.
		 */
		status = zone_status_get(zone);
		if (status < ZONE_IS_READY || status > ZONE_IS_DOWN)
			continue;
		/*
		 * Bail immediately if any callback invocation returns a
		 * non-zero value.
		 */
		ret = (*cb)(zone, data);
		if (ret != 0)
			break;
	}
	mutex_exit(&zonehash_lock);
	return (ret);
}

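/*
 * Illustration only, not part of the kernel source: a userland sketch of
 * the callback contract of zone_walk() above. Hidden entries are skipped,
 * and the first non-zero callback return stops the walk and becomes the
 * result. struct walk_zone, walk_zones() and count_until_three() are
 * hypothetical; locking is left out of the model.
 */

```c
#include <assert.h>
#include <stddef.h>

struct walk_zone {
	int id;
	int visible;	/* stands in for the READY..DOWN status check */
};

/* Call cb on each visible zone; stop at the first non-zero return. */
int
walk_zones(struct walk_zone *zones, size_t nzones,
    int (*cb)(struct walk_zone *, void *), void *data)
{
	int ret = 0;
	size_t i;

	assert(cb != NULL);
	for (i = 0; i < nzones; i++) {
		if (!zones[i].visible)
			continue;
		ret = cb(&zones[i], data);
		if (ret != 0)
			break;
	}
	return (ret);
}

/* Example callback: count visited zones and stop once id 3 is seen. */
int
count_until_three(struct walk_zone *z, void *data)
{
	int *count = data;

	(*count)++;
	return (z->id == 3);
}
```

/*
 * The callback runs with the list lock held in the real function, so a
 * callback in this style should stay short and must not block on locks
 * that are ordered before zonehash_lock.
 */
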
static int
zone_set_root(zone_t *zone, const char *upath)
{
	vnode_t *vp;
	int trycount;
	int error = 0;
	char *path;
	struct pathname upn, pn;
	size_t pathlen;

	if ((error = pn_get((char *)upath, UIO_USERSPACE, &upn)) != 0)
		return (error);

	pn_alloc(&pn);

	/* prevent infinite loop */
	trycount = 10;
	for (;;) {
		if (--trycount <= 0) {
			error = ESTALE;
			goto out;
		}

		if ((error = lookuppn(&upn, &pn, FOLLOW, NULLVPP, &vp)) == 0) {
			/*
			 * VOP_ACCESS() may cover 'vp' with a new
			 * filesystem, if 'vp' is an autoFS vnode.
			 * Get the new 'vp' if so.
			 */
			if ((error =
			    VOP_ACCESS(vp, VEXEC, 0, CRED(), NULL)) == 0 &&
			    (!vn_ismntpt(vp) ||
			    (error = traverse(&vp)) == 0)) {
				pathlen = pn.pn_pathlen + 2;
				path = kmem_alloc(pathlen, KM_SLEEP);
				(void) strncpy(path, pn.pn_path,
				    pn.pn_pathlen + 1);
				path[pathlen - 2] = '/';
				path[pathlen - 1] = '\0';
				pn_free(&pn);
				pn_free(&upn);

				/* Success! */
				break;
			}
			VN_RELE(vp);
		}
		if (error != ESTALE)
			goto out;
	}

	ASSERT(error == 0);
	zone->zone_rootvp = vp;		/* we hold a reference to vp */
	zone->zone_rootpath = path;
	zone->zone_rootpathlen = pathlen;
	if (pathlen > 5 && strcmp(path + pathlen - 5, "/lu/") == 0)
		zone->zone_flags |= ZF_IS_SCRATCH;
	return (0);

out:
	pn_free(&pn);
	pn_free(&upn);
	return (error);
}

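/*
 * Illustration only, not part of the kernel source: a userland sketch of
 * the bounded retry loop in zone_set_root() above. An attempt is repeated
 * only while it fails with ESTALE, and the pre-decrement test means a
 * budget of N allows at most N - 1 attempts. retry_on_estale() and
 * flaky_attempt() are hypothetical names.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Retry attempt() while it returns ESTALE, up to max_tries - 1 times. */
int
retry_on_estale(int (*attempt)(void *), void *arg, int max_tries)
{
	int error;

	assert(attempt != NULL);
	for (;;) {
		if (--max_tries <= 0)
			return (ESTALE);
		error = attempt(arg);
		if (error != ESTALE)
			return (error);
	}
}

/* Example attempt: fails with ESTALE until the counter reaches zero. */
int
flaky_attempt(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (ESTALE);
	}
	return (0);
}
```

/*
 * Any error other than ESTALE is returned immediately, which matches the
 * "if (error != ESTALE) goto out;" exit in the loop above.
 */
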
#define	isalnum(c)	(((c) >= '0' && (c) <= '9') || \
			((c) >= 'a' && (c) <= 'z') || \
			((c) >= 'A' && (c) <= 'Z'))

static int
zone_set_name(zone_t *zone, const char *uname)
{
	char *kname = kmem_zalloc(ZONENAME_MAX, KM_SLEEP);
	size_t len;
	int i, err;

	if ((err = copyinstr(uname, kname, ZONENAME_MAX, &len)) != 0) {
		kmem_free(kname, ZONENAME_MAX);
		return (err);	/* EFAULT or ENAMETOOLONG */
	}

	/* must be less than ZONENAME_MAX */
	if (len == ZONENAME_MAX && kname[ZONENAME_MAX - 1] != '\0') {
		kmem_free(kname, ZONENAME_MAX);
		return (EINVAL);
	}

	/*
	 * Name must start with an alphanumeric and must contain only
	 * alphanumerics, '-', '_' and '.'.
	 */
	if (!isalnum(kname[0])) {
		kmem_free(kname, ZONENAME_MAX);
		return (EINVAL);
	}
	for (i = 1; i < len - 1; i++) {
		if (!isalnum(kname[i]) && kname[i] != '-' && kname[i] != '_' &&
		    kname[i] != '.') {
			kmem_free(kname, ZONENAME_MAX);
			return (EINVAL);
		}
	}

	zone->zone_name = kname;
	return (0);
}

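/*
 * Illustration only, not part of the kernel source: a userland sketch of
 * the naming rule enforced by zone_set_name() above. The first character
 * must be an ASCII alphanumeric and the rest may also contain '-', '_'
 * and '.'. MODEL_ZONENAME_MAX is an assumed value for this sketch, and
 * zone_name_valid() is a hypothetical helper that works on an ordinary
 * NUL-terminated string rather than a copyinstr() buffer.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	MODEL_ZONENAME_MAX	64	/* assumed; includes the NUL */

static int
is_alnum_ascii(char c)
{
	return ((c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
	    (c >= 'A' && c <= 'Z'));
}

/* Return 1 if name satisfies the zone naming rule, 0 otherwise. */
int
zone_name_valid(const char *name)
{
	size_t len, i;

	assert(name != NULL);
	len = strlen(name);
	if (len == 0 || len >= MODEL_ZONENAME_MAX)
		return (0);
	if (!is_alnum_ascii(name[0]))
		return (0);
	for (i = 1; i < len; i++) {
		if (!is_alnum_ascii(name[i]) && name[i] != '-' &&
		    name[i] != '_' && name[i] != '.')
			return (0);
	}
	return (1);
}
```

/*
 * The kernel loop runs to len - 1 because the length reported by
 * copyinstr() counts the terminating NUL; strlen() does not, so the
 * sketch loops to len.
 */
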
/*
 * Gets the 32-bit hostid of the specified zone as an unsigned int. If 'zonep'
 * is NULL or it points to a zone with no hostid emulation, then the machine's
 * hostid (i.e., the global zone's hostid) is returned. This function returns
 * zero if neither the zone nor the host machine (global zone) have hostids. It
 * returns HW_INVALID_HOSTID if the function attempts to return the machine's
 * hostid and the machine's hostid is invalid.
 */
uint32_t
zone_get_hostid(zone_t *zonep)
{
	unsigned long machine_hostid;

	if (zonep == NULL || zonep->zone_hostid == HW_INVALID_HOSTID) {
		if (ddi_strtoul(hw_serial, NULL, 10, &machine_hostid) != 0)
			return (HW_INVALID_HOSTID);
		return ((uint32_t)machine_hostid);
	}
	return (zonep->zone_hostid);
}

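/*
 * Illustration only, not part of the kernel source: a userland sketch of
 * the fallback order in zone_get_hostid() above. A zone's emulated hostid
 * wins when it is set; otherwise the machine's decimal serial string is
 * parsed. The sketch uses the C library strtoul() in place of
 * ddi_strtoul(), and MODEL_INVALID_HOSTID is an assumed sentinel value;
 * model_get_hostid() is a hypothetical name.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define	MODEL_INVALID_HOSTID	0xFFFFFFFFu	/* assumed sentinel */

/*
 * zone_hostid may be NULL (no zone given) or point at the sentinel
 * (no emulation); both cases fall back to the machine's serial string.
 */
uint32_t
model_get_hostid(const uint32_t *zone_hostid, const char *hw_serial)
{
	unsigned long machine;
	char *end;

	assert(hw_serial != NULL);
	if (zone_hostid != NULL && *zone_hostid != MODEL_INVALID_HOSTID)
		return (*zone_hostid);

	errno = 0;
	machine = strtoul(hw_serial, &end, 10);
	if (errno != 0 || end == hw_serial)
		return (MODEL_INVALID_HOSTID);
	return ((uint32_t)machine);
}
```
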
/*
 * Similar to thread_create(), but makes sure the thread is in the appropriate
 * zone's zsched process (curproc->p_zone->zone_zsched) before returning.
 */
/*ARGSUSED*/
kthread_t *
zthread_create(
    caddr_t stk,
    size_t stksize,
    void (*proc)(),
    void *arg,
    size_t len,
    pri_t pri)
{
	kthread_t *t;
	zone_t *zone = curproc->p_zone;
	proc_t *pp = zone->zone_zsched;

	zone_hold(zone);	/* Reference to be dropped when thread exits */

	/*
	 * No-one should be trying to create threads if the zone is shutting
	 * down and there aren't any kernel threads around. See comment
	 * in zthread_exit().
	 */
	ASSERT(!(zone->zone_kthreads == NULL &&
	    zone_status_get(zone) >= ZONE_IS_EMPTY));
	/*
	 * Create a thread, but don't let it run until we've finished setting
	 * things up.
	 */
	t = thread_create(stk, stksize, proc, arg, len, pp, TS_STOPPED, pri);
	ASSERT(t->t_forw == NULL);
	mutex_enter(&zone_status_lock);
	if (zone->zone_kthreads == NULL) {
		t->t_forw = t->t_back = t;
	} else {
		kthread_t *tx = zone->zone_kthreads;

		t->t_forw = tx;
		t->t_back = tx->t_back;
		tx->t_back->t_forw = t;
		tx->t_back = t;
	}
	zone->zone_kthreads = t;
	mutex_exit(&zone_status_lock);

	mutex_enter(&pp->p_lock);
	t->t_proc_flag |= TP_ZTHREAD;
	project_rele(t->t_proj);
	t->t_proj = project_hold(pp->p_task->tk_proj);

	/*
	 * Setup complete, let it run.
	 */
	thread_lock(t);
	t->t_schedflag |= TS_ALLSTART;
	setrun_locked(t);
	thread_unlock(t);

	mutex_exit(&pp->p_lock);

	return (t);
}

/*
 * Similar to thread_exit(). Must be called by threads created via
 * zthread_create().
 */
void
zthread_exit(void)
{
	kthread_t *t = curthread;
	proc_t *pp = curproc;
	zone_t *zone = pp->p_zone;

	mutex_enter(&zone_status_lock);

	/*
	 * Reparent to p0
	 */
	kpreempt_disable();
	mutex_enter(&pp->p_lock);
	t->t_proc_flag &= ~TP_ZTHREAD;
	t->t_procp = &p0;
	hat_thread_exit(t);
	mutex_exit(&pp->p_lock);
	kpreempt_enable();

	if (t->t_back == t) {
		ASSERT(t->t_forw == t);
		/*
		 * If the zone is empty, once the thread count
		 * goes to zero no further kernel threads can be
		 * created. This is because if the creator is a process
		 * in the zone, then it must have exited before the zone
		 * state could be set to ZONE_IS_EMPTY.
		 * Otherwise, if the creator is a kernel thread in the
		 * zone, the thread count is non-zero.
		 *
		 * This really means that non-zone kernel threads should
		 * not create zone kernel threads.
		 */
		zone->zone_kthreads = NULL;
		if (zone_status_get(zone) == ZONE_IS_EMPTY) {
			zone_status_set(zone, ZONE_IS_DOWN);
			/*
			 * Remove any CPU caps on this zone.
			 */
			cpucaps_zone_remove(zone);
		}
	} else {
		t->t_forw->t_back = t->t_back;
		t->t_back->t_forw = t->t_forw;
		if (zone->zone_kthreads == t)
			zone->zone_kthreads = t->t_forw;
	}
	mutex_exit(&zone_status_lock);
	zone_rele(zone);
	thread_exit();
	/* NOTREACHED */
}

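/*
 * Illustration only, not part of the kernel source: a userland sketch of
 * the circular doubly-linked list that zthread_create() and
 * zthread_exit() maintain through t_forw/t_back and zone_kthreads. A new
 * node is linked in just before the old head and becomes the head;
 * removing the last node empties the ring. struct ring_node,
 * ring_insert() and ring_remove() are hypothetical names, and the
 * zone_status_lock that protects the real list is omitted.
 */

```c
#include <assert.h>
#include <stddef.h>

struct ring_node {
	struct ring_node *forw;
	struct ring_node *back;
};

/* Link t into the ring and make it the new head. */
void
ring_insert(struct ring_node **head, struct ring_node *t)
{
	if (*head == NULL) {
		t->forw = t->back = t;
	} else {
		struct ring_node *tx = *head;

		t->forw = tx;
		t->back = tx->back;
		tx->back->forw = t;
		tx->back = t;
	}
	*head = t;
}

/* Unlink t; the ring becomes empty if t was its only member. */
void
ring_remove(struct ring_node **head, struct ring_node *t)
{
	if (t->back == t) {
		assert(t->forw == t);
		*head = NULL;
	} else {
		t->forw->back = t->back;
		t->back->forw = t->forw;
		if (*head == t)
			*head = t->forw;
	}
}
```

/*
 * In the kernel the "ring became empty" branch is also where an empty
 * zone is moved to ZONE_IS_DOWN, so the list doubles as a live count of
 * the zone's kernel threads.
 */
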
static void
zone_chdir(vnode_t *vp, vnode_t **vpp, proc_t *pp)
{
	vnode_t *oldvp;

	/* we're going to hold a reference here to the directory */
	VN_HOLD(vp);

	/* update abs cwd/root path see c2/audit.c */
	if (AU_AUDITING())
		audit_chdirec(vp, vpp);

	mutex_enter(&pp->p_lock);
	oldvp = *vpp;
	*vpp = vp;
	mutex_exit(&pp->p_lock);
	if (oldvp != NULL)
		VN_RELE(oldvp);
}

/*
 * Convert an rctl value represented by an nvlist_t into an rctl_val_t.
 */
static int
nvlist2rctlval(nvlist_t *nvl, rctl_val_t *rv)
{
	nvpair_t *nvp = NULL;
	boolean_t priv_set = B_FALSE;
	boolean_t limit_set = B_FALSE;
	boolean_t action_set = B_FALSE;

	while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) {
		const char *name;
		uint64_t ui64;

		name = nvpair_name(nvp);
		if (nvpair_type(nvp) != DATA_TYPE_UINT64)
			return (EINVAL);
		(void) nvpair_value_uint64(nvp, &ui64);
		if (strcmp(name, "privilege") == 0) {
			/*
			 * Currently only privileged values are allowed, but
			 * this may change in the future.
			 */
			if (ui64 != RCPRIV_PRIVILEGED)
				return (EINVAL);
			rv->rcv_privilege = ui64;
			priv_set = B_TRUE;
		} else if (strcmp(name, "limit") == 0) {
			rv->rcv_value = ui64;
			limit_set = B_TRUE;
		} else if (strcmp(name, "action") == 0) {
			if (ui64 != RCTL_LOCAL_NOACTION &&
			    ui64 != RCTL_LOCAL_DENY)
				return (EINVAL);
			rv->rcv_flagaction = ui64;
			action_set = B_TRUE;
		} else {
			return (EINVAL);
		}
	}

	if (!(priv_set && limit_set && action_set))
		return (EINVAL);
	rv->rcv_action_signal = 0;
	rv->rcv_action_recipient = NULL;
	rv->rcv_action_recip_pid = -1;
	rv->rcv_firing_time = 0;

	return (0);
}

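/*
 * Illustration only, not part of the kernel source: a userland sketch of
 * the validation performed by nvlist2rctlval() above. Unknown keys and
 * out-of-range values are rejected, and all three of "privilege",
 * "limit" and "action" must be present. A plain array of name/value
 * pairs stands in for the nvlist, and the MODEL_* constants are
 * placeholder values, not the real RCPRIV_ and RCTL_LOCAL_ definitions.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder values chosen for this sketch only. */
#define	MODEL_PRIV_PRIVILEGED	1
#define	MODEL_ACT_NOACTION	0
#define	MODEL_ACT_DENY		1

struct model_pair {
	const char *name;
	uint64_t value;
};

struct model_rctlval {
	uint64_t privilege;
	uint64_t limit;
	uint64_t action;
};

/* Return 0 and fill *rv, or EINVAL if the pairs are not a valid value. */
int
model_pairs_to_rctlval(const struct model_pair *pairs, size_t npairs,
    struct model_rctlval *rv)
{
	int priv_set = 0, limit_set = 0, action_set = 0;
	size_t i;

	assert(rv != NULL);
	for (i = 0; i < npairs; i++) {
		uint64_t v = pairs[i].value;

		if (strcmp(pairs[i].name, "privilege") == 0) {
			if (v != MODEL_PRIV_PRIVILEGED)
				return (EINVAL);
			rv->privilege = v;
			priv_set = 1;
		} else if (strcmp(pairs[i].name, "limit") == 0) {
			rv->limit = v;
			limit_set = 1;
		} else if (strcmp(pairs[i].name, "action") == 0) {
			if (v != MODEL_ACT_NOACTION && v != MODEL_ACT_DENY)
				return (EINVAL);
			rv->action = v;
			action_set = 1;
		} else {
			return (EINVAL);	/* unknown key */
		}
	}
	/* All three keys are mandatory. */
	return ((priv_set && limit_set && action_set) ? 0 : EINVAL);
}
```
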
/*
 * Non-global zone version of start_init.
 */
void
zone_start_init(void)
{
	proc_t *p = ttoproc(curthread);
	zone_t *z = p->p_zone;

	ASSERT(!INGLOBALZONE(curproc));

	/*
	 * For all purposes (ZONE_ATTR_INITPID and restart_init),
	 * storing just the pid of init is sufficient.
	 */
	z->zone_proc_initpid = p->p_pid;

	/*
	 * We maintain zone_boot_err so that we can return the cause of the
	 * failure back to the caller of the zone_boot syscall.
	 */
	p->p_zone->zone_boot_err = start_init_common();

	/*
	 * We will prevent booting zones from becoming running zones if the
	 * global zone is shutting down.
	 */
	mutex_enter(&zone_status_lock);
	if (z->zone_boot_err != 0 || zone_status_get(global_zone) >=
	    ZONE_IS_SHUTTING_DOWN) {
		/*
		 * Make sure we are still in the booting state-- we could have
		 * raced and already be shutting down, or even further along.
		 */
		if (zone_status_get(z) == ZONE_IS_BOOTING) {
			zone_status_set(z, ZONE_IS_SHUTTING_DOWN);
		}
		mutex_exit(&zone_status_lock);
		/* It's gone bad, dispose of the process */
		if (proc_exit(CLD_EXITED, z->zone_boot_err) != 0) {
			mutex_enter(&p->p_lock);
			ASSERT(p->p_flag & SEXITLWPS);
			lwp_exit();
		}
	} else {
		if (zone_status_get(z) == ZONE_IS_BOOTING)
			zone_status_set(z, ZONE_IS_RUNNING);
		mutex_exit(&zone_status_lock);
		/* cause the process to return to userland. */
		lwp_rtt();
	}
}

struct zsched_arg {
	zone_t *zone;
	nvlist_t *nvlist;
};

/*
 * Per-zone "sched" workalike. The similarity to "sched" doesn't have
 * anything to do with scheduling, but rather with the fact that
 * per-zone kernel threads are parented to zsched, just like regular
 * kernel threads are parented to sched (p0).
 *
 * zsched is also responsible for launching init for the zone.
 */
static void
zsched(void *arg)
{
	struct zsched_arg *za = arg;
	proc_t *pp = curproc;
	proc_t *initp = proc_init;
	zone_t *zone = za->zone;
	cred_t *cr, *oldcred;
	rctl_set_t *set;
	rctl_alloc_gp_t *gp;
	contract_t *ct = NULL;
	task_t *tk, *oldtk;
	rctl_entity_p_t e;
	kproject_t *pj;

	nvlist_t *nvl = za->nvlist;
	nvpair_t *nvp = NULL;

	bcopy("zsched", PTOU(pp)->u_psargs, sizeof ("zsched"));
	bcopy("zsched", PTOU(pp)->u_comm, sizeof ("zsched"));
	PTOU(pp)->u_argc = 0;
	PTOU(pp)->u_argv = NULL;
	PTOU(pp)->u_envp = NULL;
	closeall(P_FINFO(pp));

	/*
	 * We are this zone's "zsched" process. As the zone isn't generally
	 * visible yet we don't need to grab any locks before initializing its
	 * zone_zsched pointer.
	 */
	zone_hold(zone);	/* this hold is released by zone_destroy() */
	zone->zone_zsched = pp;
	mutex_enter(&pp->p_lock);
	pp->p_zone = zone;
	mutex_exit(&pp->p_lock);

	/*
	 * Disassociate process from its 'parent'; parent ourselves to init
	 * (pid 1) and change other values as needed.
	 */
	sess_create();

	mutex_enter(&pidlock);
	proc_detach(pp);
	pp->p_ppid = 1;
	pp->p_flag |= SZONETOP;
	pp->p_ancpid = 1;
	pp->p_parent = initp;
	pp->p_psibling = NULL;
	if (initp->p_child)
		initp->p_child->p_psibling = pp;
	pp->p_sibling = initp->p_child;
	initp->p_child = pp;

	/* Decrement what newproc() incremented. */
	upcount_dec(crgetruid(CRED()), GLOBAL_ZONEID);
	/*
	 * Our credentials are about to become kcred-like, so we don't care
	 * about the caller's ruid.
	 */
	upcount_inc(crgetruid(kcred), zone->zone_id);
	mutex_exit(&pidlock);

	/*
	 * getting out of global zone, so decrement lwp counts
	 */
	pj = pp->p_task->tk_proj;
	mutex_enter(&global_zone->zone_nlwps_lock);
	pj->kpj_nlwps -= pp->p_lwpcnt;
	global_zone->zone_nlwps -= pp->p_lwpcnt;
	mutex_exit(&global_zone->zone_nlwps_lock);

	/*
	 * Decrement locked memory counts on old zone and project.
	 */
	mutex_enter(&global_zone->zone_mem_lock);
	global_zone->zone_locked_mem -= pp->p_locked_mem;
	pj->kpj_data.kpd_locked_mem -= pp->p_locked_mem;
	mutex_exit(&global_zone->zone_mem_lock);

	/*
	 * Create and join a new task in project '0' of this zone.
	 *
	 * We don't need to call holdlwps() since we know we're the only lwp in
	 * this process.
	 *
	 * task_join() returns with p_lock held.
	 */
	tk = task_create(0, zone);
	mutex_enter(&cpu_lock);
	oldtk = task_join(tk, 0);

	pj = pp->p_task->tk_proj;

	mutex_enter(&zone->zone_mem_lock);
	zone->zone_locked_mem += pp->p_locked_mem;
	pj->kpj_data.kpd_locked_mem += pp->p_locked_mem;
	mutex_exit(&zone->zone_mem_lock);

	/*
	 * add lwp counts to zsched's zone, and increment project's task count
	 * due to the task created by task_create() above
	 */

	mutex_enter(&zone->zone_nlwps_lock);
	pj->kpj_nlwps += pp->p_lwpcnt;
	pj->kpj_ntasks += 1;
	zone->zone_nlwps += pp->p_lwpcnt;
	mutex_exit(&zone->zone_nlwps_lock);

	mutex_exit(&curproc->p_lock);
	mutex_exit(&cpu_lock);
	task_rele(oldtk);

	/*
	 * The process was created by a process in the global zone, hence the
	 * credentials are wrong. We might as well have kcred-ish credentials.
	 */
	cr = zone->zone_kcred;
	crhold(cr);
	mutex_enter(&pp->p_crlock);
	oldcred = pp->p_cred;
	pp->p_cred = cr;
	mutex_exit(&pp->p_crlock);
	crfree(oldcred);

	/*
	 * Hold credentials again (for thread)
	 */
	crhold(cr);

	/*
	 * p_lwpcnt can't change since this is a kernel process.
	 */
	crset(pp, cr);

	/*
	 * Chroot
	 */
	zone_chdir(zone->zone_rootvp, &PTOU(pp)->u_cdir, pp);
	zone_chdir(zone->zone_rootvp, &PTOU(pp)->u_rdir, pp);

	/*
	 * Initialize zone's rctl set.
	 */
	set = rctl_set_create();
	gp = rctl_set_init_prealloc(RCENTITY_ZONE);
	mutex_enter(&pp->p_lock);
	e.rcep_p.zone = zone;
	e.rcep_t = RCENTITY_ZONE;
	zone->zone_rctls = rctl_set_init(RCENTITY_ZONE, pp, &e, set, gp);
	mutex_exit(&pp->p_lock);
	rctl_prealloc_destroy(gp);

	/*
	 * Apply the rctls passed in to zone_create(). This is basically a list
	 * assignment: all of the old values are removed and the new ones
	 * inserted. That is, if an empty list is passed in, all values are
	 * removed.
	 */
	while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) {
		rctl_dict_entry_t *rde;
		rctl_hndl_t hndl;
		char *name;
		nvlist_t **nvlarray;
		uint_t i, nelem;
		int error;	/* For ASSERT()s */

		name = nvpair_name(nvp);
		hndl = rctl_hndl_lookup(name);
		ASSERT(hndl != -1);
		rde = rctl_dict_lookup_hndl(hndl);
		ASSERT(rde != NULL);

		for (; /* ever */; ) {
			rctl_val_t oval;

			mutex_enter(&pp->p_lock);
			error = rctl_local_get(hndl, NULL, &oval, pp);
			mutex_exit(&pp->p_lock);
			ASSERT(error == 0);	/* Can't fail for RCTL_FIRST */
			ASSERT(oval.rcv_privilege != RCPRIV_BASIC);
			if (oval.rcv_privilege == RCPRIV_SYSTEM)
				break;
			mutex_enter(&pp->p_lock);
			error = rctl_local_delete(hndl, &oval, pp);
			mutex_exit(&pp->p_lock);
			ASSERT(error == 0);
		}
		error = nvpair_value_nvlist_array(nvp, &nvlarray, &nelem);
		ASSERT(error == 0);
		for (i = 0; i < nelem; i++) {
			rctl_val_t *nvalp;

			nvalp = kmem_cache_alloc(rctl_val_cache, KM_SLEEP);
			error = nvlist2rctlval(nvlarray[i], nvalp);
			ASSERT(error == 0);
			/*
			 * rctl_local_insert can fail if the value being
			 * inserted is a duplicate; this is OK.
			 */
			mutex_enter(&pp->p_lock);
			if (rctl_local_insert(hndl, nvalp, pp) != 0)
				kmem_cache_free(rctl_val_cache, nvalp);
			mutex_exit(&pp->p_lock);
		}
	}
	/*
	 * Tell the world that we're done setting up.
	 *
	 * At this point we want to set the zone status to ZONE_IS_INITIALIZED
	 * and atomically set the zone's processor set visibility. Once
	 * we drop pool_lock() this zone will automatically get updated
	 * to reflect any future changes to the pools configuration.
	 *
	 * Note that after we drop the locks below (zonehash_lock in
	 * particular) other operations such as a zone_getattr call can
	 * now proceed and observe the zone. That is the reason for doing a
	 * state transition to the INITIALIZED state.
	 */
	pool_lock();
	mutex_enter(&cpu_lock);
	mutex_enter(&zonehash_lock);
	zone_uniqid(zone);
	zone_zsd_configure(zone);
	if (pool_state == POOL_ENABLED)
		zone_pset_set(zone, pool_default->pool_pset->pset_id);
	mutex_enter(&zone_status_lock);
	ASSERT(zone_status_get(zone) == ZONE_IS_UNINITIALIZED);
	zone_status_set(zone, ZONE_IS_INITIALIZED);
	mutex_exit(&zone_status_lock);
	mutex_exit(&zonehash_lock);
	mutex_exit(&cpu_lock);
	pool_unlock();

	/* Now call the create callbacks for all keys */
	zsd_apply_all_keys(zsd_apply_create, zone);

	/* The callbacks are complete. Mark ZONE_IS_READY */
	mutex_enter(&zone_status_lock);
	ASSERT(zone_status_get(zone) == ZONE_IS_INITIALIZED);
	zone_status_set(zone, ZONE_IS_READY);
	mutex_exit(&zone_status_lock);
|
	/*
	 * Once we see the zone transition to the ZONE_IS_BOOTING state,
	 * we launch init, and set the state to running.
	 */
	zone_status_wait_cpr(zone, ZONE_IS_BOOTING, "zsched");

	if (zone_status_get(zone) == ZONE_IS_BOOTING) {
		id_t cid;

		/*
		 * Ok, this is a little complicated. We need to grab the
		 * zone's pool's scheduling class ID; note that by now, we
		 * are already bound to a pool if we need to be (zoneadmd
		 * will have done that to us while we're in the READY
		 * state). *But* the scheduling class for the zone's 'init'
		 * must be explicitly passed to newproc, which doesn't
		 * respect pool bindings.
		 *
		 * We hold the pool_lock across the call to newproc() to
		 * close the obvious race: the pool's scheduling class
		 * could change before we manage to create the LWP with
		 * classid 'cid'.
		 */
		pool_lock();
		if (zone->zone_defaultcid > 0)
			cid = zone->zone_defaultcid;
		else
			cid = pool_get_class(zone->zone_pool);
		if (cid == -1)
			cid = defaultcid;

		/*
		 * If this fails, zone_boot will ultimately fail. The
		 * state of the zone will be set to SHUTTING_DOWN-- userland
		 * will have to tear down the zone, and fail, or try again.
		 */
		if ((zone->zone_boot_err = newproc(zone_start_init, NULL, cid,
		    minclsyspri - 1, &ct, 0)) != 0) {
			mutex_enter(&zone_status_lock);
			zone_status_set(zone, ZONE_IS_SHUTTING_DOWN);
			mutex_exit(&zone_status_lock);
		}
		pool_unlock();
	}
|
	/*
	 * Wait for zone_destroy() to be called. This is what we spend
	 * most of our life doing.
	 */
	zone_status_wait_cpr(zone, ZONE_IS_DYING, "zsched");

	if (ct)
		/*
		 * At this point the process contract should be empty.
		 * (Though if it isn't, it's not the end of the world.)
		 */
		VERIFY(contract_abandon(ct, curproc, B_TRUE) == 0);

	/*
	 * Allow kcred to be freed when all referring processes
	 * (including this one) go away. We can't just do this in
	 * zone_free because we need to wait for the zone_cred_ref to
	 * drop to 0 before calling zone_free, and the existence of
	 * zone_kcred will prevent that. Thus, we call crfree here to
	 * balance the crdup in zone_create. The crhold calls earlier
	 * in zsched will be dropped when the thread and process exit.
	 */
	crfree(zone->zone_kcred);
	zone->zone_kcred = NULL;

	exit(CLD_EXITED, 0);
}
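
/*
 * Illustrative sketch, not part of zone.c. zsched() above asserts the
 * state it expects before each zone_status_set() call, so the zone only
 * ever moves forward through its states. The userland model below shows
 * that "check the expected state, then advance" step. All sk_* names are
 * hypothetical, the enum lists only the states named in this part of the
 * file (their relative order is an assumption), and the locking done with
 * zone_status_lock is left out because the sketch is single-threaded.
 */
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the zone states named in this file. */
typedef enum {
	SK_UNINITIALIZED,
	SK_INITIALIZED,
	SK_READY,
	SK_BOOTING,
	SK_RUNNING,
	SK_SHUTTING_DOWN,
	SK_DYING
} sk_status_t;

/*
 * Move *cur from 'expect' to 'next'. Returns false, leaving *cur
 * untouched, when the current state is not the expected one or when
 * the requested move would not go forward.
 */
static bool
sk_status_advance(sk_status_t *cur, sk_status_t expect, sk_status_t next)
{
	if (*cur != expect || next <= expect)
		return (false);
	*cur = next;
	return (true);
}
```
/*
 * A caller modelled on zone_boot() would attempt the SK_READY to
 * SK_BOOTING step and report an error when sk_status_advance() returns
 * false.
 */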
|
/*
 * Helper function to determine if there are any submounts of the
 * provided path. Used to make sure the zone doesn't "inherit" any
 * mounts from before it is created.
 */
static uint_t
zone_mount_count(const char *rootpath)
{
	vfs_t *vfsp;
	uint_t count = 0;
	size_t rootpathlen = strlen(rootpath);

	/*
	 * Holding zonehash_lock prevents race conditions with
	 * vfs_list_add()/vfs_list_remove() since we serialize with
	 * zone_find_by_path().
	 */
	ASSERT(MUTEX_HELD(&zonehash_lock));
	/*
	 * The rootpath must end with a '/'
	 */
	ASSERT(rootpath[rootpathlen - 1] == '/');

	/*
	 * This intentionally does not count the rootpath itself if that
	 * happens to be a mount point.
	 */
	vfs_list_read_lock();
	vfsp = rootvfs;
	do {
		if (strncmp(rootpath, refstr_value(vfsp->vfs_mntpt),
		    rootpathlen) == 0)
			count++;
		vfsp = vfsp->vfs_next;
	} while (vfsp != rootvfs);
	vfs_list_unlock();
	return (count);
}
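
/*
 * Illustrative sketch, not part of zone.c. zone_mount_count() relies on
 * the trailing '/' of rootpath so that a prefix comparison matches only
 * paths strictly below the root. The userland model below applies the
 * same strncmp() test to a plain array of mount point strings; the
 * function name and the array-based interface are hypothetical.
 */
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the entries of 'mntpts' that lie below 'rootpath'. Because
 * 'rootpath' ends with '/', "/zones/a/" matches "/zones/a/proc" but
 * neither "/zones/a" itself nor the sibling "/zones/ab/dev".
 */
static unsigned int
sk_count_submounts(const char *rootpath, const char *const *mntpts, size_t n)
{
	size_t rootpathlen = strlen(rootpath);
	unsigned int count = 0;
	size_t i;

	assert(rootpathlen > 0 && rootpath[rootpathlen - 1] == '/');
	for (i = 0; i < n; i++) {
		if (strncmp(rootpath, mntpts[i], rootpathlen) == 0)
			count++;
	}
	return (count);
}
```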
|
/*
 * Helper function to make sure that a zone created on 'rootpath'
 * wouldn't end up containing other zones' rootpaths.
 */
static boolean_t
zone_is_nested(const char *rootpath)
{
	zone_t *zone;
	size_t rootpathlen = strlen(rootpath);
	size_t len;

	ASSERT(MUTEX_HELD(&zonehash_lock));

	/*
	 * zone_set_root() appended '/' and '\0' at the end of rootpath
	 */
	if ((rootpathlen <= 3) && (rootpath[0] == '/') &&
	    (rootpath[1] == '/') && (rootpath[2] == '\0'))
		return (B_TRUE);

	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		if (zone == global_zone)
			continue;
		len = strlen(zone->zone_rootpath);
		if (strncmp(rootpath, zone->zone_rootpath,
		    MIN(rootpathlen, len)) == 0)
			return (B_TRUE);
	}
	return (B_FALSE);
}
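
/*
 * Illustrative sketch, not part of zone.c. zone_is_nested() compares only
 * the first MIN(rootpathlen, len) bytes, which makes the test symmetric:
 * it fires when the new root would contain an existing zone root and also
 * when it would sit inside one. The userland model below isolates that
 * comparison. The name sk_paths_nest is hypothetical, and both arguments
 * are assumed to end with '/', as zone root paths do here.
 */
```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if either path is a prefix of the other. */
static bool
sk_paths_nest(const char *a, const char *b)
{
	size_t alen = strlen(a);
	size_t blen = strlen(b);
	size_t n = (alen < blen) ? alen : blen;

	return (strncmp(a, b, n) == 0);
}
```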
|
static int
zone_set_privset(zone_t *zone, const priv_set_t *zone_privs,
    size_t zone_privssz)
{
	priv_set_t *privs;

	/*
	 * Check the size before allocating so that this path cannot leak,
	 * and return a plain errno value: the caller hands it to
	 * zone_create_error(), which calls set_errno() itself.
	 */
	if (zone_privssz < sizeof (priv_set_t))
		return (ENOMEM);

	privs = kmem_alloc(sizeof (priv_set_t), KM_SLEEP);

	if (copyin(zone_privs, privs, sizeof (priv_set_t))) {
		kmem_free(privs, sizeof (priv_set_t));
		return (EFAULT);
	}

	zone->zone_privset = privs;
	return (0);
}
|
/*
 * We make creative use of nvlists to pass in rctls from userland. The list is
 * a list of the following structures:
 *
 *	(name = rctl_name, value = nvpair_list_array)
 *
 * Where each element of the nvpair_list_array is of the form:
 *
 *	[(name = "privilege", value = RCPRIV_PRIVILEGED),
 *	(name = "limit", value = uint64_t),
 *	(name = "action", value = (RCTL_LOCAL_NOACTION || RCTL_LOCAL_DENY))]
 */
static int
parse_rctls(caddr_t ubuf, size_t buflen, nvlist_t **nvlp)
{
	nvpair_t *nvp = NULL;
	nvlist_t *nvl = NULL;
	char *kbuf;
	int error;
	rctl_val_t rv;

	*nvlp = NULL;

	if (buflen == 0)
		return (0);

	if ((kbuf = kmem_alloc(buflen, KM_NOSLEEP)) == NULL)
		return (ENOMEM);
	if (copyin(ubuf, kbuf, buflen)) {
		error = EFAULT;
		goto out;
	}
	if (nvlist_unpack(kbuf, buflen, &nvl, KM_SLEEP) != 0) {
		/*
		 * nvl may have been allocated/free'd, but the value set to
		 * non-NULL, so we reset it here.
		 */
		nvl = NULL;
		error = EINVAL;
		goto out;
	}
	while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) {
		rctl_dict_entry_t *rde;
		rctl_hndl_t hndl;
		nvlist_t **nvlarray;
		uint_t i, nelem;
		char *name;

		error = EINVAL;
		name = nvpair_name(nvp);
		if (strncmp(nvpair_name(nvp), "zone.", sizeof ("zone.") - 1)
		    != 0 || nvpair_type(nvp) != DATA_TYPE_NVLIST_ARRAY) {
			goto out;
		}
		if ((hndl = rctl_hndl_lookup(name)) == -1) {
			goto out;
		}
		rde = rctl_dict_lookup_hndl(hndl);
		error = nvpair_value_nvlist_array(nvp, &nvlarray, &nelem);
		ASSERT(error == 0);
		for (i = 0; i < nelem; i++) {
			if ((error = nvlist2rctlval(nvlarray[i], &rv)) != 0)
				goto out;
		}
		if (rctl_invalid_value(rde, &rv)) {
			error = EINVAL;
			goto out;
		}
	}
	error = 0;
	*nvlp = nvl;
out:
	kmem_free(kbuf, buflen);
	if (error && nvl != NULL)
		nvlist_free(nvl);
	return (error);
}
|
int
zone_create_error(int er_error, int er_ext, int *er_out)
{
	if (er_out != NULL) {
		if (copyout(&er_ext, er_out, sizeof (int))) {
			return (set_errno(EFAULT));
		}
	}
	return (set_errno(er_error));
}
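
/*
 * Illustrative sketch, not part of zone.c. zone_create_error() reports
 * two values at once: the primary errno, and an optional extended code
 * written through a caller-supplied pointer. The userland model below
 * shows only that calling convention. The sk_* names and the value of
 * SK_EXT_AREMOUNTS are hypothetical, and a plain store replaces
 * copyout(), so the EFAULT path of the kernel function has no
 * counterpart here.
 */
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical extended code standing in for values such as ZE_AREMOUNTS. */
#define	SK_EXT_AREMOUNTS	3

/*
 * Return the primary error and, if the caller asked for it, store the
 * extended error through 'ext_out'.
 */
static int
sk_report_error(int primary, int extended, int *ext_out)
{
	if (ext_out != NULL)
		*ext_out = extended;
	return (primary);
}
```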
|
static int
zone_set_label(zone_t *zone, const bslabel_t *lab, uint32_t doi)
{
	ts_label_t *tsl;
	bslabel_t blab;

	/* Get label from user */
	if (copyin(lab, &blab, sizeof (blab)) != 0)
		return (EFAULT);
	tsl = labelalloc(&blab, doi, KM_NOSLEEP);
	if (tsl == NULL)
		return (ENOMEM);

	zone->zone_slabel = tsl;
	return (0);
}
|
/*
 * Parses a comma-separated list of ZFS datasets into a per-zone dictionary.
 */
static int
parse_zfs(zone_t *zone, caddr_t ubuf, size_t buflen)
{
	char *kbuf;
	char *dataset, *next;
	zone_dataset_t *zd;
	size_t len;

	if (ubuf == NULL || buflen == 0)
		return (0);

	if ((kbuf = kmem_alloc(buflen, KM_NOSLEEP)) == NULL)
		return (ENOMEM);

	if (copyin(ubuf, kbuf, buflen) != 0) {
		kmem_free(kbuf, buflen);
		return (EFAULT);
	}

	dataset = next = kbuf;
	for (;;) {
		zd = kmem_alloc(sizeof (zone_dataset_t), KM_SLEEP);

		next = strchr(dataset, ',');

		if (next == NULL)
			len = strlen(dataset);
		else
			len = next - dataset;

		zd->zd_dataset = kmem_alloc(len + 1, KM_SLEEP);
		bcopy(dataset, zd->zd_dataset, len);
		zd->zd_dataset[len] = '\0';

		list_insert_head(&zone->zone_datasets, zd);

		if (next == NULL)
			break;

		dataset = next + 1;
	}

	kmem_free(kbuf, buflen);
	return (0);
}
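
/*
 * Illustrative sketch, not part of zone.c. parse_zfs() walks the buffer
 * with strchr(), copies each comma-delimited name, and inserts it at the
 * head of the zone's dataset list. The userland model below follows the
 * same steps with malloc() and a hand-rolled singly linked list. The
 * sk_* names are hypothetical, and the input is assumed to be a
 * NUL-terminated string.
 */
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sk_dataset {
	char *name;
	struct sk_dataset *next;
};

/* Free every node of the list, including the copied names. */
static void
sk_free_datasets(struct sk_dataset *head)
{
	while (head != NULL) {
		struct sk_dataset *next = head->next;

		free(head->name);
		free(head);
		head = next;
	}
}

/*
 * Split 'buf' at each ',' and push a copy of every piece onto the head
 * of a list, so the result is in reverse input order. Returns NULL if
 * an allocation fails.
 */
static struct sk_dataset *
sk_split_datasets(const char *buf)
{
	struct sk_dataset *head = NULL;
	const char *cur = buf;

	for (;;) {
		const char *comma = strchr(cur, ',');
		size_t len = (comma == NULL) ?
		    strlen(cur) : (size_t)(comma - cur);
		struct sk_dataset *d = malloc(sizeof (*d));

		if (d == NULL || (d->name = malloc(len + 1)) == NULL) {
			free(d);
			sk_free_datasets(head);
			return (NULL);
		}
		memcpy(d->name, cur, len);
		d->name[len] = '\0';
		d->next = head;
		head = d;

		if (comma == NULL)
			break;
		cur = comma + 1;
	}
	return (head);
}
```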
|
/*
 * System call to create/initialize a new zone named 'zone_name', rooted
 * at 'zone_root', with a zone-wide privilege limit set of 'zone_privs',
 * and initialized with the zone-wide rctls described in 'rctlbuf', and
 * with labeling set by 'match', 'doi', and 'label'.
 *
 * If extended error is non-null, we may use it to return more detailed
 * error information.
 */
static zoneid_t
zone_create(const char *zone_name, const char *zone_root,
    const priv_set_t *zone_privs, size_t zone_privssz,
    caddr_t rctlbuf, size_t rctlbufsz,
    caddr_t zfsbuf, size_t zfsbufsz, int *extended_error,
    int match, uint32_t doi, const bslabel_t *label,
    int flags)
{
	struct zsched_arg zarg;
	nvlist_t *rctls = NULL;
	proc_t *pp = curproc;
	zone_t *zone, *ztmp;
	zoneid_t zoneid;
	int error;
	int error2 = 0;
	char *str;
	cred_t *zkcr;
	boolean_t insert_label_hash;

	if (secpolicy_zone_config(CRED()) != 0)
		return (set_errno(EPERM));
|
	/* can't boot zone from within chroot environment */
	if (PTOU(pp)->u_rdir != NULL && PTOU(pp)->u_rdir != rootdir)
		return (zone_create_error(ENOTSUP, ZE_CHROOTED,
		    extended_error));

	zone = kmem_zalloc(sizeof (zone_t), KM_SLEEP);
	zoneid = zone->zone_id = id_alloc(zoneid_space);
	zone->zone_status = ZONE_IS_UNINITIALIZED;
	zone->zone_pool = pool_default;
	zone->zone_pool_mod = gethrtime();
	zone->zone_psetid = ZONE_PS_INVAL;
	zone->zone_ncpus = 0;
	zone->zone_ncpus_online = 0;
	zone->zone_restart_init = B_TRUE;
	zone->zone_brand = &native_brand;
	zone->zone_initname = NULL;
	mutex_init(&zone->zone_lock, NULL, MUTEX_DEFAULT, NULL);
	mutex_init(&zone->zone_nlwps_lock, NULL, MUTEX_DEFAULT, NULL);
	mutex_init(&zone->zone_mem_lock, NULL, MUTEX_DEFAULT, NULL);
	cv_init(&zone->zone_cv, NULL, CV_DEFAULT, NULL);
	list_create(&zone->zone_zsd, sizeof (struct zsd_entry),
	    offsetof(struct zsd_entry, zsd_linkage));
	list_create(&zone->zone_datasets, sizeof (zone_dataset_t),
	    offsetof(zone_dataset_t, zd_linkage));
	list_create(&zone->zone_dl_list, sizeof (zone_dl_t),
	    offsetof(zone_dl_t, zdl_linkage));
	rw_init(&zone->zone_mlps.mlpl_rwlock, NULL, RW_DEFAULT, NULL);
	rw_init(&zone->zone_mntfs_db_lock, NULL, RW_DEFAULT, NULL);

	if (flags & ZCF_NET_EXCL) {
		zone->zone_flags |= ZF_NET_EXCL;
	}
|
	if ((error = zone_set_name(zone, zone_name)) != 0) {
		zone_free(zone);
		return (zone_create_error(error, 0, extended_error));
	}

	if ((error = zone_set_root(zone, zone_root)) != 0) {
		zone_free(zone);
		return (zone_create_error(error, 0, extended_error));
	}
	if ((error = zone_set_privset(zone, zone_privs, zone_privssz)) != 0) {
		zone_free(zone);
		return (zone_create_error(error, 0, extended_error));
	}

	/* initialize node name to be the same as zone name */
	zone->zone_nodename = kmem_alloc(_SYS_NMLN, KM_SLEEP);
	(void) strncpy(zone->zone_nodename, zone->zone_name, _SYS_NMLN);
	zone->zone_nodename[_SYS_NMLN - 1] = '\0';

	zone->zone_domain = kmem_alloc(_SYS_NMLN, KM_SLEEP);
	zone->zone_domain[0] = '\0';
	zone->zone_hostid = HW_INVALID_HOSTID;
	zone->zone_shares = 1;
	zone->zone_shmmax = 0;
	zone->zone_ipc.ipcq_shmmni = 0;
	zone->zone_ipc.ipcq_semmni = 0;
	zone->zone_ipc.ipcq_msgmni = 0;
	zone->zone_bootargs = NULL;
	zone->zone_initname =
	    kmem_alloc(strlen(zone_default_initname) + 1, KM_SLEEP);
	(void) strcpy(zone->zone_initname, zone_default_initname);
	zone->zone_nlwps = 0;
	zone->zone_nlwps_ctl = INT_MAX;
	zone->zone_locked_mem = 0;
	zone->zone_locked_mem_ctl = UINT64_MAX;
	zone->zone_max_swap = 0;
	zone->zone_max_swap_ctl = UINT64_MAX;
	zone0.zone_lockedmem_kstat = NULL;
	zone0.zone_swapresv_kstat = NULL;

	/*
	 * Zsched initializes the rctls.
	 */
	zone->zone_rctls = NULL;
|
	if ((error = parse_rctls(rctlbuf, rctlbufsz, &rctls)) != 0) {
		zone_free(zone);
		return (zone_create_error(error, 0, extended_error));
	}

	if ((error = parse_zfs(zone, zfsbuf, zfsbufsz)) != 0) {
		zone_free(zone);
		return (set_errno(error));
	}

	/*
	 * Read in the trusted system parameters:
	 * match flag and sensitivity label.
	 */
	zone->zone_match = match;
	if (is_system_labeled() && !(zone->zone_flags & ZF_IS_SCRATCH)) {
		/* Fail if requested to set doi to anything but system's doi */
		if (doi != 0 && doi != default_doi) {
			zone_free(zone);
			return (set_errno(EINVAL));
		}
		/* Always apply system's doi to the zone */
		error = zone_set_label(zone, label, default_doi);
		if (error != 0) {
			zone_free(zone);
			return (set_errno(error));
		}
		insert_label_hash = B_TRUE;
	} else {
		/* all zones get an admin_low label if system is not labeled */
		zone->zone_slabel = l_admin_low;
		label_hold(l_admin_low);
		insert_label_hash = B_FALSE;
	}

	/*
	 * Stop all lwps since that's what normally happens as part of fork().
	 * This needs to happen before we grab any locks to avoid deadlock
	 * (another lwp in the process could be waiting for the held lock).
	 */
	if (curthread != pp->p_agenttp && !holdlwps(SHOLDFORK)) {
		zone_free(zone);
		if (rctls)
			nvlist_free(rctls);
		return (zone_create_error(error, 0, extended_error));
	}
|
	if (block_mounts() == 0) {
		mutex_enter(&pp->p_lock);
		if (curthread != pp->p_agenttp)
			continuelwps(pp);
		mutex_exit(&pp->p_lock);
		zone_free(zone);
		if (rctls)
			nvlist_free(rctls);
		return (zone_create_error(error, 0, extended_error));
	}

	/*
	 * Set up credential for kernel access. After this, any errors
	 * should go through the dance in errout rather than calling
	 * zone_free directly.
	 */
	zone->zone_kcred = crdup(kcred);
	crsetzone(zone->zone_kcred, zone);
	priv_intersect(zone->zone_privset, &CR_PPRIV(zone->zone_kcred));
	priv_intersect(zone->zone_privset, &CR_EPRIV(zone->zone_kcred));
	priv_intersect(zone->zone_privset, &CR_IPRIV(zone->zone_kcred));
	priv_intersect(zone->zone_privset, &CR_LPRIV(zone->zone_kcred));

	mutex_enter(&zonehash_lock);
	/*
	 * Make sure zone doesn't already exist.
	 *
	 * If the system and zone are labeled,
	 * make sure no other zone exists that has the same label.
	 */
	if ((ztmp = zone_find_all_by_name(zone->zone_name)) != NULL ||
	    (insert_label_hash &&
	    (ztmp = zone_find_all_by_label(zone->zone_slabel)) != NULL)) {
		zone_status_t status;

		status = zone_status_get(ztmp);
		if (status == ZONE_IS_READY || status == ZONE_IS_RUNNING)
			error = EEXIST;
		else
			error = EBUSY;

		if (insert_label_hash)
			error2 = ZE_LABELINUSE;

		goto errout;
	}
|
	/*
	 * Don't allow zone creations which would cause one zone's rootpath to
	 * be accessible from that of another (non-global) zone.
	 */
	if (zone_is_nested(zone->zone_rootpath)) {
		error = EBUSY;
		goto errout;
	}

	ASSERT(zonecount != 0);		/* check for leaks */
	if (zonecount + 1 > maxzones) {
		error = ENOMEM;
		goto errout;
	}

	if (zone_mount_count(zone->zone_rootpath) != 0) {
		error = EBUSY;
		error2 = ZE_AREMOUNTS;
		goto errout;
	}

	/*
	 * Zone is still incomplete, but we need to drop all locks while
	 * zsched() initializes this zone's kernel process. We
	 * optimistically add the zone to the hashtable and associated
	 * lists so a parallel zone_create() doesn't try to create the
	 * same zone.
	 */
	zonecount++;
	(void) mod_hash_insert(zonehashbyid,
	    (mod_hash_key_t)(uintptr_t)zone->zone_id,
	    (mod_hash_val_t)(uintptr_t)zone);
	str = kmem_alloc(strlen(zone->zone_name) + 1, KM_SLEEP);
	(void) strcpy(str, zone->zone_name);
	(void) mod_hash_insert(zonehashbyname, (mod_hash_key_t)str,
	    (mod_hash_val_t)(uintptr_t)zone);
	if (insert_label_hash) {
		(void) mod_hash_insert(zonehashbylabel,
		    (mod_hash_key_t)zone->zone_slabel, (mod_hash_val_t)zone);
		zone->zone_flags |= ZF_HASHED_LABEL;
	}

	/*
	 * Insert into active list. At this point there are no 'hold's
	 * on the zone, but everyone else knows not to use it, so we can
	 * continue to use it. zsched() will do a zone_hold() if the
	 * newproc() is successful.
	 */
	list_insert_tail(&zone_active, zone);
	mutex_exit(&zonehash_lock);
|
	zarg.zone = zone;
	zarg.nvlist = rctls;
	/*
	 * The process, task, and project rctls are probably wrong;
	 * we need an interface to get the default values of all rctls,
	 * and initialize zsched appropriately. I'm not sure that that
	 * makes much of a difference, though.
	 */
	error = newproc(zsched, (void *)&zarg, syscid, minclsyspri, NULL, 0);
	if (error != 0) {
		/*
		 * We need to undo all globally visible state.
		 */
		mutex_enter(&zonehash_lock);
		list_remove(&zone_active, zone);
		if (zone->zone_flags & ZF_HASHED_LABEL) {
			ASSERT(zone->zone_slabel != NULL);
			(void) mod_hash_destroy(zonehashbylabel,
			    (mod_hash_key_t)zone->zone_slabel);
		}
		(void) mod_hash_destroy(zonehashbyname,
		    (mod_hash_key_t)(uintptr_t)zone->zone_name);
		(void) mod_hash_destroy(zonehashbyid,
		    (mod_hash_key_t)(uintptr_t)zone->zone_id);
		ASSERT(zonecount > 1);
		zonecount--;
		goto errout;
	}
|
	/*
	 * Zone creation can't fail from now on.
	 */

	/*
	 * Create zone kstats
	 */
	zone_kstat_create(zone);

	/*
	 * Let the other lwps continue.
	 */
	mutex_enter(&pp->p_lock);
	if (curthread != pp->p_agenttp)
		continuelwps(pp);
	mutex_exit(&pp->p_lock);

	/*
	 * Wait for zsched to finish initializing the zone.
	 */
	zone_status_wait(zone, ZONE_IS_READY);
	/*
	 * The zone is fully visible, so we can let mounts progress.
	 */
	resume_mounts();
	if (rctls)
		nvlist_free(rctls);

	return (zoneid);

errout:
	mutex_exit(&zonehash_lock);
	/*
	 * Let the other lwps continue.
	 */
	mutex_enter(&pp->p_lock);
	if (curthread != pp->p_agenttp)
		continuelwps(pp);
	mutex_exit(&pp->p_lock);

	resume_mounts();
	if (rctls)
		nvlist_free(rctls);
	/*
	 * There is currently one reference to the zone, a cred_ref from
	 * zone_kcred. To free the zone, we call crfree, which will call
	 * zone_cred_rele, which will call zone_free.
	 */
	ASSERT(zone->zone_cred_ref == 1);	/* for zone_kcred */
	ASSERT(zone->zone_kcred->cr_ref == 1);
	ASSERT(zone->zone_ref == 0);
	zkcr = zone->zone_kcred;
	zone->zone_kcred = NULL;
	crfree(zkcr);	/* triggers call to zone_free */
	return (zone_create_error(error, error2, extended_error));
}
|
4106 |
||
4107 |
/* |
|
4108 |
* Cause the zone to boot. This is pretty simple, since we let zoneadmd do |
|
2267 | 4109 |
* the heavy lifting. initname is the path to the program to launch |
4110 |
* at the "top" of the zone; if this is NULL, we use the system default, |
|
4111 |
* which is stored at zone_default_initname. |
|
0 | 4112 |
*/ |
4113 |
static int |
|
2267 | 4114 |
zone_boot(zoneid_t zoneid) |
0 | 4115 |
{ |
4116 |
int err; |
|
4117 |
zone_t *zone; |
|
4118 |
||
4119 |
if (secpolicy_zone_config(CRED()) != 0) |
|
4120 |
return (set_errno(EPERM)); |
|
4121 |
if (zoneid < MIN_USERZONEID || zoneid > MAX_ZONEID) |
|
4122 |
return (set_errno(EINVAL)); |
|
4123 |
||
4124 |
mutex_enter(&zonehash_lock); |
|
4125 |
/* |
|
4126 |
* Look for zone under hash lock to prevent races with calls to |
|
4127 |
* zone_shutdown, zone_destroy, etc. |
|
4128 |
*/ |
|
4129 |
if ((zone = zone_find_all_by_id(zoneid)) == NULL) { |
|
4130 |
mutex_exit(&zonehash_lock); |
|
4131 |
return (set_errno(EINVAL)); |
|
4132 |
} |
|
4133 |
||
4134 |
mutex_enter(&zone_status_lock); |
|
4135 |
if (zone_status_get(zone) != ZONE_IS_READY) { |
|
4136 |
mutex_exit(&zone_status_lock); |
|
4137 |
mutex_exit(&zonehash_lock); |
|
4138 |
return (set_errno(EINVAL)); |
|
4139 |
} |
|
4140 |
zone_status_set(zone, ZONE_IS_BOOTING); |
|
4141 |
mutex_exit(&zone_status_lock); |
|
4142 |
||
4143 |
zone_hold(zone); /* so we can use the zone_t later */ |
|
4144 |
mutex_exit(&zonehash_lock); |
|
4145 |
||
4146 |
if (zone_status_wait_sig(zone, ZONE_IS_RUNNING) == 0) { |
|
4147 |
zone_rele(zone); |
|
4148 |
return (set_errno(EINTR)); |
|
4149 |
} |
|
4150 |
||
4151 |
/* |
|
4152 |
* Boot (starting init) might have failed, in which case the zone |
|
4153 |
* will go to the SHUTTING_DOWN state; an appropriate errno will |
|
4154 |
* be placed in zone->zone_boot_err, and so we return that. |
|
4155 |
*/ |
|
4156 |
err = zone->zone_boot_err; |
|
4157 |
zone_rele(zone); |
|
4158 |
return (err ? set_errno(err) : 0); |
|
4159 |
} |
|
4160 |
||
4161 |
/* |
|
4162 |
* Kills all user processes in the zone, waiting for them all to exit |
|
4163 |
* before returning. |
|
4164 |
*/ |
|
4165 |
static int |
|
4166 |
zone_empty(zone_t *zone) |
|
4167 |
{ |
|
4168 |
int waitstatus; |
|
4169 |
||
4170 |
/* |
|
4171 |
* We need to drop zonehash_lock before killing all |
|
4172 |
* processes, otherwise we'll deadlock with zone_find_* |
|
4173 |
* which can be called from the exit path. |
|
4174 |
*/ |
|
4175 |
ASSERT(MUTEX_NOT_HELD(&zonehash_lock)); |
|
11066
cebb50cbe4f9
PSARC/2009/396 Tickless Kernel Architecture / lbolt decoupling
Rafael Vanoni <rafael.vanoni@sun.com>
parents:
10910
diff
changeset
|
4176 |
while ((waitstatus = zone_status_timedwait_sig(zone, |
4177 |
ddi_get_lbolt() + hz, ZONE_IS_EMPTY)) == -1) { |
0 | 4178 |
killall(zone->zone_id); |
4179 |
} |
|
4180 |
/* |
|
4181 |
* Return EINTR if we were signaled. |
|
4182 |
*/ |
|
4183 |
if (waitstatus == 0) |
|
4184 |
return (EINTR); |
|
4185 |
return (0); |
|
4186 |
} |
|
4187 |
||
4188 |
/* |
|
1676 | 4189 |
* This function implements the policy for zone visibility. |
4190 |
* |
|
4191 |
* In standard Solaris, a non-global zone can only see itself. |
|
4192 |
* |
|
4193 |
* In Trusted Extensions, a labeled zone can look up any zone whose label |
|
4194 |
* it dominates. For this test, the label of the global zone is treated as |
|
4195 |
* admin_high so it is special-cased instead of being checked for dominance. |
|
4196 |
* |
|
4197 |
* Returns true if zone attributes are viewable, false otherwise. |
|
4198 |
*/ |
|
4199 |
static boolean_t |
|
4200 |
zone_list_access(zone_t *zone) |
|
4201 |
{ |
|
4202 |
||
4203 |
if (curproc->p_zone == global_zone || |
|
4204 |
curproc->p_zone == zone) { |
|
4205 |
return (B_TRUE); |
|
1769
338500d67d4f
6404654 zoneadm mount command fails on labeled systems
carlsonj
parents:
1676
diff
changeset
|
4206 |
} else if (is_system_labeled() && !(zone->zone_flags & ZF_IS_SCRATCH)) { |
1676 | 4207 |
bslabel_t *curproc_label; |
4208 |
bslabel_t *zone_label; |
|
4209 |
||
4210 |
curproc_label = label2bslabel(curproc->p_zone->zone_slabel); |
|
4211 |
zone_label = label2bslabel(zone->zone_slabel); |
|
4212 |
||
4213 |
if (zone->zone_id != GLOBAL_ZONEID && |
|
4214 |
bldominates(curproc_label, zone_label)) { |
|
4215 |
return (B_TRUE); |
|
4216 |
} else { |
|
4217 |
return (B_FALSE); |
|
4218 |
} |
|
4219 |
} else { |
|
4220 |
return (B_FALSE); |
|
4221 |
} |
|
4222 |
} |
|
4223 |
||
4224 |
/* |
|
0 | 4225 |
* Systemcall to start the zone's halt sequence. By the time this |
4226 |
* function successfully returns, all user processes and kernel threads |
|
4227 |
* executing in it will have exited, ZSD shutdown callbacks executed, |
|
4228 |
* and the zone status set to ZONE_IS_DOWN. |
|
4229 |
* |
|
4230 |
* It is possible that the call will interrupt itself if the caller is the |
|
4231 |
* parent of any process running in the zone, and doesn't have SIGCHLD blocked. |
|
4232 |
*/ |
|
4233 |
static int |
|
4234 |
zone_shutdown(zoneid_t zoneid) |
|
4235 |
{ |
|
4236 |
int error; |
|
4237 |
zone_t *zone; |
|
4238 |
zone_status_t status; |
|
4239 |
||
4240 |
if (secpolicy_zone_config(CRED()) != 0) |
|
4241 |
return (set_errno(EPERM)); |
|
4242 |
if (zoneid < MIN_USERZONEID || zoneid > MAX_ZONEID) |
|
4243 |
return (set_errno(EINVAL)); |
|
4244 |
||
4245 |
/* |
|
4246 |
* Block mounts so that VFS_MOUNT() can get an accurate view of |
|
4247 |
* the zone's status with regard to ZONE_IS_SHUTTING_DOWN. |
|
4248 |
* |
|
4249 |
* e.g. NFS can fail the mount if it determines that the zone |
|
4250 |
* has already begun the shutdown sequence. |
|
4251 |
*/ |
|
4252 |
if (block_mounts() == 0) |
|
4253 |
return (set_errno(EINTR)); |
|
4254 |
mutex_enter(&zonehash_lock); |
|
4255 |
/* |
|
4256 |
* Look for zone under hash lock to prevent races with other |
|
4257 |
* calls to zone_shutdown and zone_destroy. |
|
4258 |
*/ |
|
4259 |
if ((zone = zone_find_all_by_id(zoneid)) == NULL) { |
|
4260 |
mutex_exit(&zonehash_lock); |
|
4261 |
resume_mounts(); |
|
4262 |
return (set_errno(EINVAL)); |
|
4263 |
} |
|
4264 |
mutex_enter(&zone_status_lock); |
|
4265 |
status = zone_status_get(zone); |
|
4266 |
/* |
|
4267 |
* Fail if the zone isn't fully initialized yet. |
|
4268 |
*/ |
|
4269 |
if (status < ZONE_IS_READY) { |
|
4270 |
mutex_exit(&zone_status_lock); |
|
4271 |
mutex_exit(&zonehash_lock); |
|
4272 |
resume_mounts(); |
|
4273 |
return (set_errno(EINVAL)); |
|
4274 |
} |
|
4275 |
/* |
|
4276 |
* If conditions required for zone_shutdown() to return have been met, |
|
4277 |
* return success. |
|
4278 |
*/ |
|
4279 |
if (status >= ZONE_IS_DOWN) { |
|
4280 |
mutex_exit(&zone_status_lock); |
|
4281 |
mutex_exit(&zonehash_lock); |
|
4282 |
resume_mounts(); |
|
4283 |
return (0); |
|
4284 |
} |
|
4285 |
/* |
|
4286 |
* If zone_shutdown() hasn't been called before, go through the motions. |
|
4287 |
* If it has, there's nothing to do but wait for the kernel threads to |
|
4288 |
* drain. |
|
4289 |
*/ |
|
4290 |
if (status < ZONE_IS_EMPTY) { |
|
4291 |
uint_t ntasks; |
|
4292 |
||
4293 |
mutex_enter(&zone->zone_lock); |
|
4294 |
if ((ntasks = zone->zone_ntasks) != 1) { |
|
4295 |
/* |
|
4296 |
* There's still stuff running. |
|
4297 |
*/ |
|
4298 |
zone_status_set(zone, ZONE_IS_SHUTTING_DOWN); |
|
4299 |
} |
|
4300 |
mutex_exit(&zone->zone_lock); |
|
4301 |
if (ntasks == 1) { |
|
4302 |
/* |
|
4303 |
* The only way to create another task is through |
|
4304 |
* zone_enter(), which will block until we drop |
|
4305 |
* zonehash_lock. The zone is empty. |
|
4306 |
*/ |
|
4307 |
if (zone->zone_kthreads == NULL) { |
|
4308 |
/* |
|
4309 |
* Skip ahead to ZONE_IS_DOWN |
|
4310 |
*/ |
|
4311 |
zone_status_set(zone, ZONE_IS_DOWN); |
|
4312 |
} else { |
|
4313 |
zone_status_set(zone, ZONE_IS_EMPTY); |
|
4314 |
} |
|
4315 |
} |
|
4316 |
} |
|
4317 |
zone_hold(zone); /* so we can use the zone_t later */ |
|
4318 |
mutex_exit(&zone_status_lock); |
|
4319 |
mutex_exit(&zonehash_lock); |
|
4320 |
resume_mounts(); |
|
4321 |
||
4322 |
if ((error = zone_empty(zone)) != 0) { |
|
4323 |
zone_rele(zone); |
|
4324 |
return (set_errno(error)); |
|
4325 |
} |
|
4326 |
/* |
|
4327 |
* After the zone status goes to ZONE_IS_DOWN this zone will no |
|
4328 |
* longer be notified of changes to the pools configuration, so |
|
4329 |
* in order to not end up with a stale pool pointer, we point |
|
4330 |
* ourselves at the default pool and remove all resource |
|
4331 |
* visibility. This is especially important as the zone_t may |
|
4332 |
* languish on the deathrow for a very long time waiting for |
|
4333 |
* creds to drain out. |
|
4334 |
* |
|
4335 |
* This rebinding of the zone can happen multiple times |
|
4336 |
* (presumably due to interrupted or parallel systemcalls) |
|
4337 |
* without any adverse effects. |
|
4338 |
*/ |
|
4339 |
if (pool_lock_intr() != 0) { |
|
4340 |
zone_rele(zone); |
|
4341 |
return (set_errno(EINTR)); |
|
4342 |
} |
|
4343 |
if (pool_state == POOL_ENABLED) { |
|
4344 |
mutex_enter(&cpu_lock); |
|
4345 |
zone_pool_set(zone, pool_default); |
|
4346 |
/* |
|
4347 |
* The zone no longer needs to be able to see any cpus. |
|
4348 |
*/ |
|
4349 |
zone_pset_set(zone, ZONE_PS_INVAL); |
|
4350 |
mutex_exit(&cpu_lock); |
|
4351 |
} |
|
4352 |
pool_unlock(); |
|
4353 |
||
4354 |
/* |
|
4355 |
* ZSD shutdown callbacks can be executed multiple times, hence |
|
4356 |
* it is safe to not be holding any locks across this call. |
|
4357 |
*/ |
|
4358 |
zone_zsd_callbacks(zone, ZSD_SHUTDOWN); |
|
4359 |
||
4360 |
mutex_enter(&zone_status_lock); |
|
4361 |
if (zone->zone_kthreads == NULL && zone_status_get(zone) < ZONE_IS_DOWN) |
|
4362 |
zone_status_set(zone, ZONE_IS_DOWN); |
|
4363 |
mutex_exit(&zone_status_lock); |
|
4364 |
||
4365 |
/* |
|
4366 |
* Wait for kernel threads to drain. |
|
4367 |
*/ |
|
4368 |
if (!zone_status_wait_sig(zone, ZONE_IS_DOWN)) { |
|
4369 |
zone_rele(zone); |
|
4370 |
return (set_errno(EINTR)); |
|
4371 |
} |
|
2712
f74a135872bc
PSARC/2005/471 BrandZ: Support for non-native zones
nn35248
parents:
2677
diff
changeset
|
4372 |
|
3671
48011e38989d
6500376 projdel from SUNWesu in SUNWCreq in s10/snv has missing dependency
sl108498
parents:
3448
diff
changeset
|
4373 |
/* |
4374 |
* Zone can become down/destroyable even if the above wait |
4375 |
* returns EINTR, so any code added here may never execute. |
4376 |
* (i.e. don't add code here) |
4377 |
*/ |
2712
f74a135872bc
PSARC/2005/471 BrandZ: Support for non-native zones
nn35248
parents:
2677
diff
changeset
|
4378 |
|
0 | 4379 |
zone_rele(zone); |
4380 |
return (0); |
|
4381 |
} |
|
4382 |
||
4383 |
/* |
|
4384 |
* Systemcall entry point to finalize the zone halt process. The caller |
|
2677
212d61b14a8b
PSARC/2006/451 System V resource controls for Zones
ml93401
parents:
2267
diff
changeset
|
4385 |
* must have already successfully called zone_shutdown(). |
0 | 4386 |
* |
4387 |
* Upon successful completion, the zone will have been fully destroyed: |
|
4388 |
* zsched will have exited, destructor callbacks executed, and the zone |
|
4389 |
* removed from the list of active zones. |
|
4390 |
*/ |
|
4391 |
static int |
|
4392 |
zone_destroy(zoneid_t zoneid) |
|
4393 |
{ |
|
4394 |
uint64_t uniqid; |
|
4395 |
zone_t *zone; |
|
4396 |
zone_status_t status; |
|
4397 |
||
4398 |
if (secpolicy_zone_config(CRED()) != 0) |
|
4399 |
return (set_errno(EPERM)); |
|
4400 |
if (zoneid < MIN_USERZONEID || zoneid > MAX_ZONEID) |
|
4401 |
return (set_errno(EINVAL)); |
|
4402 |
||
4403 |
mutex_enter(&zonehash_lock); |
|
4404 |
/* |
|
4405 |
* Look for zone under hash lock to prevent races with other |
|
4406 |
* calls to zone_destroy. |
|
4407 |
*/ |
|
4408 |
if ((zone = zone_find_all_by_id(zoneid)) == NULL) { |
|
4409 |
mutex_exit(&zonehash_lock); |
|
4410 |
return (set_errno(EINVAL)); |
|
4411 |
} |
|
4412 |
||
4413 |
if (zone_mount_count(zone->zone_rootpath) != 0) { |
|
4414 |
mutex_exit(&zonehash_lock); |
|
4415 |
return (set_errno(EBUSY)); |
|
4416 |
} |
|
4417 |
mutex_enter(&zone_status_lock); |
|
4418 |
status = zone_status_get(zone); |
|
4419 |
if (status < ZONE_IS_DOWN) { |
|
4420 |
mutex_exit(&zone_status_lock); |
|
4421 |
mutex_exit(&zonehash_lock); |
|
4422 |
return (set_errno(EBUSY)); |
|
4423 |
} else if (status == ZONE_IS_DOWN) { |
|
4424 |
zone_status_set(zone, ZONE_IS_DYING); /* Tell zsched to exit */ |
|
4425 |
} |
|
4426 |
mutex_exit(&zone_status_lock); |
|
4427 |
zone_hold(zone); |
|
4428 |
mutex_exit(&zonehash_lock); |
|
4429 |
||
4430 |
/* |
|
4431 |
* Wait for zsched to exit. |
|
4432 |
*/ |
|
4433 |
zone_status_wait(zone, ZONE_IS_DEAD); |
|
4434 |
zone_zsd_callbacks(zone, ZSD_DESTROY); |
|
3448 | 4435 |
zone->zone_netstack = NULL; |
0 | 4436 |
uniqid = zone->zone_uniqid; |
4437 |
zone_rele(zone); |
|
4438 |
zone = NULL; /* potentially free'd */ |
|
4439 |
||
4440 |
mutex_enter(&zonehash_lock); |
|
4441 |
for (; /* ever */; ) { |
|
4442 |
boolean_t unref; |
|
4443 |
||
4444 |
if ((zone = zone_find_all_by_id(zoneid)) == NULL || |
|
4445 |
zone->zone_uniqid != uniqid) { |
|
4446 |
/* |
|
4447 |
* The zone has gone away. Necessary conditions |
|
4448 |
* are met, so we return success. |
|
4449 |
*/ |
|
4450 |
mutex_exit(&zonehash_lock); |
|
4451 |
return (0); |
|
4452 |
} |
|
4453 |
mutex_enter(&zone->zone_lock); |
|
4454 |
unref = ZONE_IS_UNREF(zone); |
|
4455 |
mutex_exit(&zone->zone_lock); |
|
4456 |
if (unref) { |
|
4457 |
/* |
|
4458 |
* There is only one reference to the zone -- that |
|
4459 |
* added when the zone was added to the hashtables -- |
|
4460 |
* and things will remain this way until we drop |
|
4461 |
* zonehash_lock... we can go ahead and clean up the |
|
4462 |
* zone. |
|
4463 |
*/ |
|
4464 |
break; |
|
4465 |
} |
|
4466 |
||
4467 |
if (cv_wait_sig(&zone_destroy_cv, &zonehash_lock) == 0) { |
|
4468 |
/* Signaled */ |
|
4469 |
mutex_exit(&zonehash_lock); |
|
4470 |
return (set_errno(EINTR)); |
|
4471 |
} |
|
4472 |
||
4473 |
} |
|
4474 |
||
3792 | 4475 |
/* |
4476 |
* Remove CPU cap for this zone now since we're not going to |
|
4477 |
* fail below this point. |
|
4478 |
*/ |
|
4479 |
cpucaps_zone_remove(zone); |
|
4480 |
||
4481 |
/* Get rid of the zone's kstats */ |
|
3247 | 4482 |
zone_kstat_delete(zone); |
4483 |
||
12273
63678502e95e
PSARC 2009/377 In-kernel pfexec implementation.
Casper H.S. Dik <Casper.Dik@Sun.COM>
parents:
11861
diff
changeset
|
4484 |
/* remove the pfexecd doors */ |
4485 |
if (zone->zone_pfexecd != NULL) { |
4486 |
klpd_freelist(&zone->zone_pfexecd); |
4487 |
zone->zone_pfexecd = NULL; |
4488 |
} |
4489 |
|
4888
51ac39c1472f
6574205 No support for abstract namespace UNIX sockets in lx brand library emulation
eh208807
parents:
4846
diff
changeset
|
4490 |
/* free brand specific data */ |
4491 |
if (ZONE_IS_BRANDED(zone)) |
4492 |
ZBROP(zone)->b_free_brand_data(zone); |
4493 |
|
3671
48011e38989d
6500376 projdel from SUNWesu in SUNWCreq in s10/snv has missing dependency
sl108498
parents:
3448
diff
changeset
|
4494 |
/* Say goodbye to brand framework. */ |
4495 |
brand_unregister_zone(zone->zone_brand); |
4496 |
|
0 | 4497 |
/* |
4498 |
* It is now safe to let the zone be recreated; remove it from the |
|
4499 |
* lists. The memory will not be freed until the last cred |
|
4500 |
* reference goes away. |
|
4501 |
*/ |
|
4502 |
ASSERT(zonecount > 1); /* must be > 1; can't destroy global zone */ |
|
4503 |
zonecount--; |
|
4504 |
/* remove from active list and hash tables */ |
|
4505 |
list_remove(&zone_active, zone); |
|
4506 |
(void) mod_hash_destroy(zonehashbyname, |
|
4507 |
(mod_hash_key_t)zone->zone_name); |
|
4508 |
(void) mod_hash_destroy(zonehashbyid, |
|
4509 |
(mod_hash_key_t)(uintptr_t)zone->zone_id); |
|
1769
338500d67d4f
6404654 zoneadm mount command fails on labeled systems
carlsonj
parents:
1676
diff
changeset
|
4510 |
if (zone->zone_flags & ZF_HASHED_LABEL) |
1676 | 4511 |
(void) mod_hash_destroy(zonehashbylabel, |
4512 |
(mod_hash_key_t)zone->zone_slabel); |
|
0 | 4513 |
mutex_exit(&zonehash_lock); |
4514 |
||
766 | 4515 |
/* |
4516 |
* Release the root vnode; we're not using it anymore. Nor should any |
|
4517 |
* other thread that might access it exist. |
|
4518 |
*/ |
|
4519 |
if (zone->zone_rootvp != NULL) { |
|
4520 |
VN_RELE(zone->zone_rootvp); |
|
4521 |
zone->zone_rootvp = NULL; |
|
4522 |
} |
|
4523 |
||
0 | 4524 |
/* add to deathrow list */ |
4525 |
mutex_enter(&zone_deathrow_lock); |
|
4526 |
list_insert_tail(&zone_deathrow, zone); |
|
4527 |
mutex_exit(&zone_deathrow_lock); |
|
4528 |
||
4529 |
/* |
|
4530 |
* Drop last reference (which was added by zsched()); this will |
|
4531 |
* free the zone unless there are outstanding cred references. |
|
4532 |
*/ |
|
4533 |
zone_rele(zone); |
|
4534 |
return (0); |
|
4535 |
} |
|
4536 |
||
4537 |
/* |
|
4538 |
* Systemcall entry point for zone_getattr(2). |
|
4539 |
*/ |
|
4540 |
static ssize_t |
|
4541 |
zone_getattr(zoneid_t zoneid, int attr, void *buf, size_t bufsize) |
|
4542 |
{ |
|
4543 |
size_t size; |
|
4544 |
int error = 0, err; |
|
4545 |
zone_t *zone; |
|
4546 |
char *zonepath; |
|
2267 | 4547 |
char *outstr; |
0 | 4548 |
zone_status_t zone_status; |
4549 |
pid_t initpid; |
|
3792 | 4550 |
boolean_t global = (curzone == global_zone); |
4551 |
boolean_t inzone = (curzone->zone_id == zoneid); |
|
3448 | 4552 |
ushort_t flags; |
0 | 4553 |
|
4554 |
mutex_enter(&zonehash_lock); |
|
4555 |
if ((zone = zone_find_all_by_id(zoneid)) == NULL) { |
|
4556 |
mutex_exit(&zonehash_lock); |
|
4557 |
return (set_errno(EINVAL)); |
|
4558 |
} |
|
4559 |
zone_status = zone_status_get(zone); |
|
5880 | 4560 |
if (zone_status < ZONE_IS_INITIALIZED) { |
0 | 4561 |
mutex_exit(&zonehash_lock); |
4562 |
return (set_errno(EINVAL)); |
|
4563 |
} |
|
4564 |
zone_hold(zone); |
|
4565 |
mutex_exit(&zonehash_lock); |
|
4566 |
||
4567 |
/* |
|
1676 | 4568 |
* If not in the global zone, don't show information about other zones, |
4569 |
* unless the system is labeled and the local zone's label dominates |
|
4570 |
* the other zone. |
|
0 | 4571 |
*/ |
1676 | 4572 |
if (!zone_list_access(zone)) { |
0 | 4573 |
zone_rele(zone); |
4574 |
return (set_errno(EINVAL)); |
|
4575 |
} |
|
4576 |
||
4577 |
switch (attr) { |
|
4578 |
case ZONE_ATTR_ROOT: |
|
4579 |
if (global) { |
|
4580 |
/* |
|
4581 |
* Copy the path to trim the trailing "/" (except for |
|
4582 |
* the global zone). |
|
4583 |
*/ |
|
4584 |
if (zone != global_zone) |
|
4585 |
size = zone->zone_rootpathlen - 1; |
|
4586 |
else |
|
4587 |
size = zone->zone_rootpathlen; |
|
4588 |
zonepath = kmem_alloc(size, KM_SLEEP); |
|
4589 |
bcopy(zone->zone_rootpath, zonepath, size); |
|
4590 |
zonepath[size - 1] = '\0'; |
|
4591 |
} else { |
|
3792 | 4592 |
if (inzone || !is_system_labeled()) { |
1676 | 4593 |
/* |
4594 |
* Caller is not in the global zone. |
|
4595 |
* If the query is on the current zone |
|
4596 |
* or the system is not labeled, |
|
4597 |
* just return faked-up path for current zone. |
|
4598 |
*/ |
|
4599 |
zonepath = "/"; |
|
4600 |
size = 2; |
|
4601 |
} else { |
|
4602 |
/* |
|
4603 |
* Return related path for current zone. |
|
4604 |
*/ |
|
4605 |
int prefix_len = strlen(zone_prefix); |
|
4606 |
int zname_len = strlen(zone->zone_name); |
|
4607 |
||
4608 |
size = prefix_len + zname_len + 1; |
|
4609 |
zonepath = kmem_alloc(size, KM_SLEEP); |
|
4610 |
bcopy(zone_prefix, zonepath, prefix_len); |
|
4611 |
bcopy(zone->zone_name, zonepath + |
|
2267 | 4612 |
prefix_len, zname_len); |
1676 | 4613 |
zonepath[size - 1] = '\0'; |
4614 |
} |
|
0 | 4615 |
} |
4616 |
if (bufsize > size) |
|
4617 |
bufsize = size; |
|
4618 |
if (buf != NULL) { |
|
4619 |
err = copyoutstr(zonepath, buf, bufsize, NULL); |
|
4620 |
if (err != 0 && err != ENAMETOOLONG) |
|
4621 |
error = EFAULT; |
|
4622 |
} |
|
3792 | 4623 |
if (global || (is_system_labeled() && !inzone)) |
0 | 4624 |
kmem_free(zonepath, size); |
4625 |
break; |
|
4626 |
||
4627 |
case ZONE_ATTR_NAME: |
|
4628 |
size = strlen(zone->zone_name) + 1; |
|
4629 |
if (bufsize > size) |
|
4630 |
bufsize = size; |
|
4631 |
if (buf != NULL) { |
|
4632 |
err = copyoutstr(zone->zone_name, buf, bufsize, NULL); |
|
4633 |
if (err != 0 && err != ENAMETOOLONG) |
|
4634 |
error = EFAULT; |
|
4635 |
} |
|
4636 |
break; |
|
4637 |
||
4638 |
case ZONE_ATTR_STATUS: |
|
4639 |
/* |
|
4640 |
* Since we're not holding zonehash_lock, the zone status |
|
4641 |
* may be anything; leave it up to userland to sort it out. |
|
4642 |
*/ |
|
4643 |
size = sizeof (zone_status); |
|
4644 |
if (bufsize > size) |
|
4645 |
bufsize = size; |
|
4646 |
zone_status = zone_status_get(zone); |
|
4647 |
if (buf != NULL && |
|
4648 |
copyout(&zone_status, buf, bufsize) != 0) |
|
4649 |
error = EFAULT; |
|
4650 |
break; |
|
3448 | 4651 |
case ZONE_ATTR_FLAGS: |
4652 |
size = sizeof (zone->zone_flags); |
|
4653 |
if (bufsize > size) |
|
4654 |
bufsize = size; |
|
4655 |
flags = zone->zone_flags; |
|
4656 |
if (buf != NULL && |
|
4657 |
copyout(&flags, buf, bufsize) != 0) |
|
4658 |
error = EFAULT; |
|
4659 |
break; |
|
0 | 4660 |
case ZONE_ATTR_PRIVSET: |
4661 |
size = sizeof (priv_set_t); |
|
4662 |
if (bufsize > size) |
|
4663 |
bufsize = size; |
|
4664 |
if (buf != NULL && |
|
4665 |
copyout(zone->zone_privset, buf, bufsize) != 0) |
|
4666 |
error = EFAULT; |
|
4667 |
break; |
|
4668 |
case ZONE_ATTR_UNIQID: |
|
4669 |
size = sizeof (zone->zone_uniqid); |
|
4670 |
if (bufsize > size) |
|
4671 |
bufsize = size; |
|
4672 |
if (buf != NULL && |
|
4673 |
copyout(&zone->zone_uniqid, buf, bufsize) != 0) |
|
4674 |
error = EFAULT; |
|
4675 |
break; |
|
4676 |
case ZONE_ATTR_POOLID: |
|
4677 |
{ |
|
4678 |
pool_t *pool; |
|
4679 |
poolid_t poolid; |
|
4680 |
||
4681 |
if (pool_lock_intr() != 0) { |
|
4682 |
error = EINTR; |
|
4683 |
break; |
|
4684 |
} |
|
4685 |
pool = zone_pool_get(zone); |
|
4686 |
poolid = pool->pool_id; |
|
4687 |
pool_unlock(); |
|
4688 |
size = sizeof (poolid); |
|
4689 |
if (bufsize > size) |
|
4690 |
bufsize = size; |
|
4691 |
if (buf != NULL && copyout(&poolid, buf, bufsize) != 0) |
|
4692 |
error = EFAULT; |
|
4693 |
} |
|
4694 |
break; |
|
1676 | 4695 |
case ZONE_ATTR_SLBL: |
4696 |
size = sizeof (bslabel_t); |
|
4697 |
if (bufsize > size) |
|
4698 |
bufsize = size; |
|
4699 |
if (zone->zone_slabel == NULL) |
|
4700 |
error = EINVAL; |
|
4701 |
else if (buf != NULL && |
|
4702 |
copyout(label2bslabel(zone->zone_slabel), buf, |
|
4703 |
bufsize) != 0) |
|
4704 |
error = EFAULT; |
|
4705 |
break; |
|
0 | 4706 |
case ZONE_ATTR_INITPID: |
4707 |
size = sizeof (initpid); |
|
4708 |
if (bufsize > size) |
|
4709 |
bufsize = size; |
|
4710 |
initpid = zone->zone_proc_initpid; |
|
4711 |
if (initpid == -1) { |
|
4712 |
error = ESRCH; |
|
4713 |
break; |
|
4714 |
} |
|
4715 |
if (buf != NULL && |
|
4716 |
copyout(&initpid, buf, bufsize) != 0) |
|
4717 |
error = EFAULT; |
|
4718 |
break; |
|
2712
f74a135872bc
PSARC/2005/471 BrandZ: Support for non-native zones
nn35248
parents:
2677
diff
changeset
|
4719 |
case ZONE_ATTR_BRAND: |
4720 |
size = strlen(zone->zone_brand->b_name) + 1; |
4721 |
|
4722 |
if (bufsize > size) |
4723 |
bufsize = size; |
4724 |
if (buf != NULL) { |
4725 |
err = copyoutstr(zone->zone_brand->b_name, buf, |
4726 |
bufsize, NULL); |
4727 |
if (err != 0 && err != ENAMETOOLONG) |
4728 |
error = EFAULT; |
4729 |
} |
4730 |
break; |
2267 | 4731 |
case ZONE_ATTR_INITNAME: |
4732 |
size = strlen(zone->zone_initname) + 1; |
|
4733 |
if (bufsize > size) |
|
4734 |
bufsize = size; |
|
4735 |
if (buf != NULL) { |
|
4736 |
err = copyoutstr(zone->zone_initname, buf, bufsize, |
|
4737 |
NULL); |
|
4738 |
if (err != 0 && err != ENAMETOOLONG) |
|
4739 |
error = EFAULT; |
|
4740 |
} |
|
4741 |
break; |
|
4742 |
case ZONE_ATTR_BOOTARGS: |
|
4743 |
if (zone->zone_bootargs == NULL) |
|
4744 |
outstr = ""; |
|
4745 |
else |
|
4746 |
outstr = zone->zone_bootargs; |
|
4747 |
size = strlen(outstr) + 1; |
|
4748 |
if (bufsize > size) |
|
4749 |
bufsize = size; |
|
4750 |
if (buf != NULL) { |
|
4751 |
err = copyoutstr(outstr, buf, bufsize, NULL); |
|
4752 |
if (err != 0 && err != ENAMETOOLONG) |
|
4753 |
error = EFAULT; |
|
4754 |
} |
|
4755 |
break; |
|
3247 | 4756 |
case ZONE_ATTR_PHYS_MCAP: |
4757 |
size = sizeof (zone->zone_phys_mcap); |
|
4758 |
if (bufsize > size) |
|
4759 |
bufsize = size; |
|
4760 |
if (buf != NULL && |
|
4761 |
copyout(&zone->zone_phys_mcap, buf, bufsize) != 0) |
|
4762 |
error = EFAULT; |
|
4763 |
break; |
|
4764 |
case ZONE_ATTR_SCHED_CLASS: |
|
4765 |
mutex_enter(&class_lock); |
|
4766 |
||
4767 |
if (zone->zone_defaultcid >= loaded_classes) |
|
4768 |
outstr = ""; |
|
4769 |
else |
|
4770 |
outstr = sclass[zone->zone_defaultcid].cl_name; |
|
4771 |
size = strlen(outstr) + 1; |
|
4772 |
if (bufsize > size) |
|
4773 |
bufsize = size; |
|
4774 |
if (buf != NULL) { |
|
4775 |
err = copyoutstr(outstr, buf, bufsize, NULL); |
|
4776 |
if (err != 0 && err != ENAMETOOLONG) |
|
4777 |
error = EFAULT; |
|
4778 |
} |
|
4779 |
||
4780 |
mutex_exit(&class_lock); |
|
4781 |
break; |
|
8662
18153249ee93
PSARC/2008/647 Configurable Hostids for Non-Global Zones
jv227347 <Jordan.Vaughan@Sun.com>
parents:
8364
diff
changeset
|
4782 |
case ZONE_ATTR_HOSTID: |
4783 |
if (zone->zone_hostid != HW_INVALID_HOSTID && |
4784 |
bufsize == sizeof (zone->zone_hostid)) { |
4785 |
size = sizeof (zone->zone_hostid); |
4786 |
if (buf != NULL && copyout(&zone->zone_hostid, buf, |
4787 |
bufsize) != 0) |
4788 |
error = EFAULT; |
4789 |
} else { |
4790 |
error = EINVAL; |
4791 |
} |
4792 |
break; |
0 | 4793 |
default: |
2712
f74a135872bc
PSARC/2005/471 BrandZ: Support for non-native zones
nn35248
parents:
2677
diff
changeset
|
4794 |
if ((attr >= ZONE_ATTR_BRAND_ATTRS) && ZONE_IS_BRANDED(zone)) { |
4795 |
size = bufsize; |
4796 |
error = ZBROP(zone)->b_getattr(zone, attr, buf, &size); |
4797 |
} else { |
4798 |
error = EINVAL; |
4799 |
} |
0 | 4800 |
} |
4801 |
zone_rele(zone); |
|
4802 |
||
4803 |
if (error) |
|
4804 |
return (set_errno(error)); |
|
4805 |
return ((ssize_t)size); |
|
4806 |
} |
|
4807 |
||
/*
 * Systemcall entry point for zone_setattr(2).
 */
/*ARGSUSED*/
static int
zone_setattr(zoneid_t zoneid, int attr, void *buf, size_t bufsize)
{
	zone_t *zone;
	zone_status_t zone_status;
	int err;

	if (secpolicy_zone_config(CRED()) != 0)
		return (set_errno(EPERM));

	/*
	 * Only the ZONE_ATTR_PHYS_MCAP attribute can be set on the
	 * global zone.
	 */
	if (zoneid == GLOBAL_ZONEID && attr != ZONE_ATTR_PHYS_MCAP) {
		return (set_errno(EINVAL));
	}

	mutex_enter(&zonehash_lock);
	if ((zone = zone_find_all_by_id(zoneid)) == NULL) {
		mutex_exit(&zonehash_lock);
		return (set_errno(EINVAL));
	}
	zone_hold(zone);
	mutex_exit(&zonehash_lock);

	/*
	 * At present most attributes can only be set on non-running,
	 * non-global zones.
	 */
	zone_status = zone_status_get(zone);
	if (attr != ZONE_ATTR_PHYS_MCAP && zone_status > ZONE_IS_READY) {
		err = EINVAL;
		goto done;
	}

	switch (attr) {
	case ZONE_ATTR_INITNAME:
		err = zone_set_initname(zone, (const char *)buf);
		break;
	case ZONE_ATTR_BOOTARGS:
		err = zone_set_bootargs(zone, (const char *)buf);
		break;
	case ZONE_ATTR_BRAND:
		err = zone_set_brand(zone, (const char *)buf);
		break;
	case ZONE_ATTR_PHYS_MCAP:
		err = zone_set_phys_mcap(zone, (const uint64_t *)buf);
		break;
	case ZONE_ATTR_SCHED_CLASS:
		err = zone_set_sched_class(zone, (const char *)buf);
		break;
	case ZONE_ATTR_HOSTID:
		if (bufsize == sizeof (zone->zone_hostid)) {
			if (copyin(buf, &zone->zone_hostid, bufsize) == 0)
				err = 0;
			else
				err = EFAULT;
		} else {
			err = EINVAL;
		}
		break;
	default:
		if ((attr >= ZONE_ATTR_BRAND_ATTRS) && ZONE_IS_BRANDED(zone))
			err = ZBROP(zone)->b_setattr(zone, attr, buf, bufsize);
		else
			err = EINVAL;
	}

done:
	zone_rele(zone);
	return (err != 0 ? set_errno(err) : 0);
}

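/*
 * Illustrative sketch, not part of this file: a user-space model of the
 * two checks zone_setattr() makes before dispatching on the attribute.
 * The enum values, MODEL_GLOBAL_ZONEID, and the helper name
 * setattr_gate() are assumptions made for the example; only the order
 * and sense of the checks are taken from the function above.
 */

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's zone states and attributes. */
enum model_status {
	MODEL_UNINIT, MODEL_INIT, MODEL_READY, MODEL_BOOTING, MODEL_RUNNING
};
enum model_attr {
	MODEL_ATTR_INITNAME, MODEL_ATTR_BOOTARGS, MODEL_ATTR_PHYS_MCAP
};

#define	MODEL_GLOBAL_ZONEID	0

/*
 * Return 0 if the attribute may be set on the zone, EINVAL otherwise.
 */
int
setattr_gate(int zoneid, enum model_attr attr, enum model_status status)
{
	assert(zoneid >= 0);	/* zone IDs are non-negative in this model */

	/* Only the physical memory cap may be set on the global zone. */
	if (zoneid == MODEL_GLOBAL_ZONEID && attr != MODEL_ATTR_PHYS_MCAP)
		return (EINVAL);

	/* Everything else requires a zone that is not yet running. */
	if (attr != MODEL_ATTR_PHYS_MCAP && status > MODEL_READY)
		return (EINVAL);

	return (0);
}
```

/*
 * In this model the memory cap is the only attribute that passes both
 * checks for a running zone or for the global zone.
 */
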
/*
 * Return zero if the process has at least one vnode mapped in to its
 * address space which shouldn't be allowed to change zones.
 *
 * Also return zero if the process has any shared mappings which reserve
 * swap. This is because the counting for zone.max-swap does not allow swap
 * reservation to be shared between zones. Zone swap reservation is counted
 * on zone->zone_max_swap.
 */
static int
as_can_change_zones(void)
{
	proc_t *pp = curproc;
	struct seg *seg;
	struct as *as = pp->p_as;
	vnode_t *vp;
	int allow = 1;

	ASSERT(pp->p_as != &kas);
	AS_LOCK_ENTER(as, &as->a_lock, RW_READER);
	for (seg = AS_SEGFIRST(as); seg != NULL; seg = AS_SEGNEXT(as, seg)) {

		/*
		 * Cannot enter zone with shared anon memory which
		 * reserves swap. See comment above.
		 */
		if (seg_can_change_zones(seg) == B_FALSE) {
			allow = 0;
			break;
		}
		/*
		 * if we can't get a backing vnode for this segment then skip
		 * it.
		 */
		vp = NULL;
		if (SEGOP_GETVP(seg, seg->s_base, &vp) != 0 || vp == NULL)
			continue;
		if (!vn_can_change_zones(vp)) { /* bail on first match */
			allow = 0;
			break;
		}
	}
	AS_LOCK_EXIT(as, &as->a_lock);
	return (allow);
}

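/*
 * Illustrative sketch, not part of this file: the segment walk in
 * as_can_change_zones() reduced to a user-space loop over an array.
 * struct model_seg and its three flags are hypothetical simplifications
 * of what seg_can_change_zones(), SEGOP_GETVP() and vn_can_change_zones()
 * report; locking is omitted.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical segment: just the facts the walk looks at. */
struct model_seg {
	bool shared_swap;	/* shared mapping that reserves swap */
	bool has_vnode;		/* a backing vnode could be obtained */
	bool vnode_may_move;	/* that vnode may change zones */
};

/*
 * Return 1 if every segment may follow the process into another zone,
 * or 0 as soon as one segment may not.
 */
int
model_as_can_change_zones(const struct model_seg *segs, size_t nsegs)
{
	size_t i;

	assert(segs != NULL || nsegs == 0);
	for (i = 0; i < nsegs; i++) {
		if (segs[i].shared_swap)
			return (0);
		if (!segs[i].has_vnode)
			continue;	/* nothing backs it; skip */
		if (!segs[i].vnode_may_move)
			return (0);	/* bail on first match */
	}
	return (1);
}
```

/*
 * A segment without a backing vnode never blocks the move by itself;
 * only shared swap reservations and pinned vnodes do.
 */
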
/*
 * Count swap reserved by curproc's address space
 */
static size_t
as_swresv(void)
{
	proc_t *pp = curproc;
	struct seg *seg;
	struct as *as = pp->p_as;
	size_t swap = 0;

	ASSERT(pp->p_as != &kas);
	ASSERT(AS_WRITE_HELD(as, &as->a_lock));
	for (seg = AS_SEGFIRST(as); seg != NULL; seg = AS_SEGNEXT(as, seg))
		swap += seg_swresv(seg);

	return (swap);
}

/*
 * Systemcall entry point for zone_enter().
 *
 * The current process is injected into said zone. In the process
 * it will change its project membership, privileges, rootdir/cwd,
 * zone-wide rctls, and pool association to match those of the zone.
 *
 * The first zone_enter() called while the zone is in the ZONE_IS_READY
 * state will transition it to ZONE_IS_RUNNING. Processes may only
 * enter a zone that is "ready" or "running".
 */
static int
zone_enter(zoneid_t zoneid)
{
	zone_t *zone;
	vnode_t *vp;
	proc_t *pp = curproc;
	contract_t *ct;
	cont_process_t *ctp;
	task_t *tk, *oldtk;
	kproject_t *zone_proj0;
	cred_t *cr, *newcr;
	pool_t *oldpool, *newpool;
	sess_t *sp;
	uid_t uid;
	zone_status_t status;
	int err = 0;
	rctl_entity_p_t e;
	size_t swap;
	kthread_id_t t;

	if (secpolicy_zone_config(CRED()) != 0)
		return (set_errno(EPERM));
	if (zoneid < MIN_USERZONEID || zoneid > MAX_ZONEID)
		return (set_errno(EINVAL));

	/*
	 * Stop all lwps so we don't need to hold a lock to look at
	 * curproc->p_zone. This needs to happen before we grab any
	 * locks to avoid deadlock (another lwp in the process could
	 * be waiting for the held lock).
	 */
	if (curthread != pp->p_agenttp && !holdlwps(SHOLDFORK))
		return (set_errno(EINTR));

	/*
	 * Make sure we're not changing zones with files open or mapped in
	 * to our address space which shouldn't be changing zones.
	 */
	if (!files_can_change_zones()) {
		err = EBADF;
		goto out;
	}
	if (!as_can_change_zones()) {
		err = EFAULT;
		goto out;
	}

	mutex_enter(&zonehash_lock);
	if (pp->p_zone != global_zone) {
		mutex_exit(&zonehash_lock);
		err = EINVAL;
		goto out;
	}

	zone = zone_find_all_by_id(zoneid);
	if (zone == NULL) {
		mutex_exit(&zonehash_lock);
		err = EINVAL;
		goto out;
	}

	/*
	 * To prevent processes in a zone from holding contracts on
	 * extrazonal resources, and to avoid process contract
	 * memberships which span zones, contract holders and processes
	 * which aren't the sole members of their encapsulating process
	 * contracts are not allowed to zone_enter.
	 */
	ctp = pp->p_ct_process;
	ct = &ctp->conp_contract;
	mutex_enter(&ct->ct_lock);
	mutex_enter(&pp->p_lock);
	if ((avl_numnodes(&pp->p_ct_held) != 0) || (ctp->conp_nmembers != 1)) {
		mutex_exit(&pp->p_lock);
		mutex_exit(&ct->ct_lock);
		mutex_exit(&zonehash_lock);
		err = EINVAL;
		goto out;
	}

	/*
	 * Moreover, we don't allow processes whose encapsulating
	 * process contracts have inherited extrazonal contracts.
	 * While it would be easier to eliminate all process contracts
	 * with inherited contracts, we need to be able to give a
	 * restarted init (or other zone-penetrating process) its
	 * predecessor's contracts.
	 */
	if (ctp->conp_ninherited != 0) {
		contract_t *next;
		for (next = list_head(&ctp->conp_inherited); next;
		    next = list_next(&ctp->conp_inherited, next)) {
			if (contract_getzuniqid(next) != zone->zone_uniqid) {
				mutex_exit(&pp->p_lock);
				mutex_exit(&ct->ct_lock);
				mutex_exit(&zonehash_lock);
				err = EINVAL;
				goto out;
			}
		}
	}

	mutex_exit(&pp->p_lock);
	mutex_exit(&ct->ct_lock);

	status = zone_status_get(zone);
	if (status < ZONE_IS_READY || status >= ZONE_IS_SHUTTING_DOWN) {
		/*
		 * Can't join
		 */
		mutex_exit(&zonehash_lock);
		err = EINVAL;
		goto out;
	}

	/*
	 * Make sure new priv set is within the permitted set for caller
	 */
	if (!priv_issubset(zone->zone_privset, &CR_OPPRIV(CRED()))) {
		mutex_exit(&zonehash_lock);
		err = EPERM;
		goto out;
	}
	/*
	 * We want to momentarily drop zonehash_lock while we optimistically
	 * bind curproc to the pool it should be running in. This is safe
	 * since the zone can't disappear (we have a hold on it).
	 */
	zone_hold(zone);
	mutex_exit(&zonehash_lock);

	/*
	 * Grab pool_lock to keep the pools configuration from changing
	 * and to stop ourselves from getting rebound to another pool
	 * until we join the zone.
	 */
	if (pool_lock_intr() != 0) {
		zone_rele(zone);
		err = EINTR;
		goto out;
	}
	ASSERT(secpolicy_pool(CRED()) == 0);
	/*
	 * Bind ourselves to the pool currently associated with the zone.
	 */
	oldpool = curproc->p_pool;
	newpool = zone_pool_get(zone);
	if (pool_state == POOL_ENABLED && newpool != oldpool &&
	    (err = pool_do_bind(newpool, P_PID, P_MYID,
	    POOL_BIND_ALL)) != 0) {
		pool_unlock();
		zone_rele(zone);
		goto out;
	}

	/*
	 * Grab cpu_lock now; we'll need it later when we call
	 * task_join().
	 */
	mutex_enter(&cpu_lock);
	mutex_enter(&zonehash_lock);
	/*
	 * Make sure the zone hasn't moved on since we dropped zonehash_lock.
	 */
	if (zone_status_get(zone) >= ZONE_IS_SHUTTING_DOWN) {
		/*
		 * Can't join anymore.
		 */
		mutex_exit(&zonehash_lock);
		mutex_exit(&cpu_lock);
		if (pool_state == POOL_ENABLED &&
		    newpool != oldpool)
			(void) pool_do_bind(oldpool, P_PID, P_MYID,
			    POOL_BIND_ALL);
		pool_unlock();
		zone_rele(zone);
		err = EINVAL;
		goto out;
	}

	/*
	 * a_lock must be held while transferring locked memory and swap
	 * reservation from the global zone to the non-global zone because
	 * asynchronous faults on the process's address space can lock
	 * memory and reserve swap via MCL_FUTURE and MAP_NORESERVE
	 * segments respectively.
	 */
	AS_LOCK_ENTER(pp->p_as, &pp->p_as->a_lock, RW_WRITER);
	swap = as_swresv();
	mutex_enter(&pp->p_lock);
	zone_proj0 = zone->zone_zsched->p_task->tk_proj;
	/* verify that we do not exceed any task or lwp limits */
	mutex_enter(&zone->zone_nlwps_lock);
	/* add new lwps to zone and zone's proj0 */
	zone_proj0->kpj_nlwps += pp->p_lwpcnt;
	zone->zone_nlwps += pp->p_lwpcnt;
	/* add 1 task to zone's proj0 */
	zone_proj0->kpj_ntasks += 1;
	mutex_exit(&zone->zone_nlwps_lock);

	mutex_enter(&zone->zone_mem_lock);
	zone->zone_locked_mem += pp->p_locked_mem;
	zone_proj0->kpj_data.kpd_locked_mem += pp->p_locked_mem;
	zone->zone_max_swap += swap;
	mutex_exit(&zone->zone_mem_lock);

	mutex_enter(&(zone_proj0->kpj_data.kpd_crypto_lock));
	zone_proj0->kpj_data.kpd_crypto_mem += pp->p_crypto_mem;
	mutex_exit(&(zone_proj0->kpj_data.kpd_crypto_lock));

	/* remove lwps from proc's old zone and old project */
	mutex_enter(&pp->p_zone->zone_nlwps_lock);
	pp->p_zone->zone_nlwps -= pp->p_lwpcnt;
	pp->p_task->tk_proj->kpj_nlwps -= pp->p_lwpcnt;
	mutex_exit(&pp->p_zone->zone_nlwps_lock);

	mutex_enter(&pp->p_zone->zone_mem_lock);
	pp->p_zone->zone_locked_mem -= pp->p_locked_mem;
	pp->p_task->tk_proj->kpj_data.kpd_locked_mem -= pp->p_locked_mem;
	pp->p_zone->zone_max_swap -= swap;
	mutex_exit(&pp->p_zone->zone_mem_lock);

	mutex_enter(&(pp->p_task->tk_proj->kpj_data.kpd_crypto_lock));
	pp->p_task->tk_proj->kpj_data.kpd_crypto_mem -= pp->p_crypto_mem;
	mutex_exit(&(pp->p_task->tk_proj->kpj_data.kpd_crypto_lock));

	pp->p_flag |= SZONETOP;
	pp->p_zone = zone;
	mutex_exit(&pp->p_lock);
	AS_LOCK_EXIT(pp->p_as, &pp->p_as->a_lock);

	/*
	 * Joining the zone cannot fail from now on.
	 *
	 * This means that a lot of the following code can be commonized and
	 * shared with zsched().
	 */

	/*
	 * If the process contract fmri was inherited, we need to
	 * flag this so that any contract status will not leak
	 * extra zone information, svc_fmri in this case
	 */
	if (ctp->conp_svc_ctid != ct->ct_id) {
		mutex_enter(&ct->ct_lock);
		ctp->conp_svc_zone_enter = ct->ct_id;
		mutex_exit(&ct->ct_lock);
	}

	/*
	 * Reset the encapsulating process contract's zone.
	 */
	ASSERT(ct->ct_mzuniqid == GLOBAL_ZONEUNIQID);
	contract_setzuniqid(ct, zone->zone_uniqid);

	/*
	 * Create a new task and associate the process with the project keyed
	 * by (projid,zoneid).
	 *
	 * We might as well be in project 0; the global zone's projid doesn't
	 * make much sense in a zone anyhow.
	 *
	 * This also increments zone_ntasks, and returns with p_lock held.
	 */
	tk = task_create(0, zone);
	oldtk = task_join(tk, 0);
	mutex_exit(&cpu_lock);

	/*
	 * call RCTLOP_SET functions on this proc
	 */
	e.rcep_p.zone = zone;
	e.rcep_t = RCENTITY_ZONE;
	(void) rctl_set_dup(NULL, NULL, pp, &e, zone->zone_rctls, NULL,
	    RCD_CALLBACK);
	mutex_exit(&pp->p_lock);

	/*
	 * We don't need to hold any of zsched's locks here; not only do we know
	 * the process and zone aren't going away, we know its session isn't
	 * changing either.
	 *
	 * By joining zsched's session here, we mimic the behavior in the
	 * global zone of init's sid being the pid of sched. We extend this
	 * to all zlogin-like zone_enter()'ing processes as well.
	 */
	mutex_enter(&pidlock);
	sp = zone->zone_zsched->p_sessp;
	sess_hold(zone->zone_zsched);
	mutex_enter(&pp->p_lock);
	pgexit(pp);
	sess_rele(pp->p_sessp, B_TRUE);
	pp->p_sessp = sp;
	pgjoin(pp, zone->zone_zsched->p_pidp);

	/*
	 * If any threads are scheduled to be placed on zone wait queue they
	 * should abandon the idea since the wait queue is changing.
	 * We need to be holding pidlock & p_lock to do this.
	 */
	if ((t = pp->p_tlist) != NULL) {
		do {
			thread_lock(t);
			/*
			 * Kick this thread so that he doesn't sit
			 * on a wrong wait queue.
			 */
			if (ISWAITING(t))
				setrun_locked(t);

			if (t->t_schedflag & TS_ANYWAITQ)
				t->t_schedflag &= ~TS_ANYWAITQ;

			thread_unlock(t);
		} while ((t = t->t_forw) != pp->p_tlist);
	}

	/*
	 * If there is a default scheduling class for the zone and it is not
	 * the class we are currently in, change all of the threads in the
	 * process to the new class. We need to be holding pidlock & p_lock
	 * when we call parmsset so this is a good place to do it.
	 */
	if (zone->zone_defaultcid > 0 &&
	    zone->zone_defaultcid != curthread->t_cid) {
		pcparms_t pcparms;

		pcparms.pc_cid = zone->zone_defaultcid;
		pcparms.pc_clparms[0] = 0;

		/*
		 * If setting the class fails, we still want to enter the zone.
		 */
		if ((t = pp->p_tlist) != NULL) {
			do {
				(void) parmsset(&pcparms, t);
			} while ((t = t->t_forw) != pp->p_tlist);
		}
	}

	mutex_exit(&pp->p_lock);
	mutex_exit(&pidlock);

	mutex_exit(&zonehash_lock);
	/*
	 * We're firmly in the zone; let pools progress.
	 */
	pool_unlock();
	task_rele(oldtk);
	/*
	 * We don't need to retain a hold on the zone since we already
	 * incremented zone_ntasks, so the zone isn't going anywhere.
	 */
	zone_rele(zone);

	/*
	 * Chroot
	 */
	vp = zone->zone_rootvp;
	zone_chdir(vp, &PTOU(pp)->u_cdir, pp);
	zone_chdir(vp, &PTOU(pp)->u_rdir, pp);

	/*
	 * Change process credentials
	 */
	newcr = cralloc();
	mutex_enter(&pp->p_crlock);
	cr = pp->p_cred;
	crcopy_to(cr, newcr);
	crsetzone(newcr, zone);
	pp->p_cred = newcr;

	/*
	 * Restrict all process privilege sets to zone limit
	 */
	priv_intersect(zone->zone_privset, &CR_PPRIV(newcr));
	priv_intersect(zone->zone_privset, &CR_EPRIV(newcr));
	priv_intersect(zone->zone_privset, &CR_IPRIV(newcr));
	priv_intersect(zone->zone_privset, &CR_LPRIV(newcr));
	mutex_exit(&pp->p_crlock);
	crset(pp, newcr);

	/*
	 * Adjust upcount to reflect zone entry.
	 */
	uid = crgetruid(newcr);
	mutex_enter(&pidlock);
	upcount_dec(uid, GLOBAL_ZONEID);
	upcount_inc(uid, zoneid);
	mutex_exit(&pidlock);

	/*
	 * Set up core file path and content.
	 */
	set_core_defaults();

out:
	/*
	 * Let the other lwps continue.
	 */
	mutex_enter(&pp->p_lock);
	if (curthread != pp->p_agenttp)
		continuelwps(pp);
	mutex_exit(&pp->p_lock);

	return (err != 0 ? set_errno(err) : 0);
}

/*
 * Systemcall entry point for zone_list(2).
 *
 * Processes running in a (non-global) zone only see themselves.
 * On labeled systems, they see all zones whose label they dominate.
 */
static int
zone_list(zoneid_t *zoneidlist, uint_t *numzones)
{
	zoneid_t *zoneids;
	zone_t *zone, *myzone;
	uint_t user_nzones, real_nzones;
	uint_t domi_nzones;
	int error;

	if (copyin(numzones, &user_nzones, sizeof (uint_t)) != 0)
		return (set_errno(EFAULT));

	myzone = curproc->p_zone;
	if (myzone != global_zone) {
		bslabel_t *mybslab;

		if (!is_system_labeled()) {
			/* just return current zone */
			real_nzones = domi_nzones = 1;
			zoneids = kmem_alloc(sizeof (zoneid_t), KM_SLEEP);
			zoneids[0] = myzone->zone_id;
		} else {
			/* return all zones that are dominated */
			mutex_enter(&zonehash_lock);
			real_nzones = zonecount;
			domi_nzones = 0;
			if (real_nzones > 0) {
				zoneids = kmem_alloc(real_nzones *
				    sizeof (zoneid_t), KM_SLEEP);
				mybslab = label2bslabel(myzone->zone_slabel);
				for (zone = list_head(&zone_active);
				    zone != NULL;
				    zone = list_next(&zone_active, zone)) {
					if (zone->zone_id == GLOBAL_ZONEID)
						continue;
					if (zone != myzone &&
					    (zone->zone_flags & ZF_IS_SCRATCH))
						continue;
					/*
					 * Note that a label always dominates
					 * itself, so myzone is always included
					 * in the list.
					 */
					if (bldominates(mybslab,
					    label2bslabel(zone->zone_slabel))) {
						zoneids[domi_nzones++] =
						    zone->zone_id;
					}
				}
			}
			mutex_exit(&zonehash_lock);
		}
	} else {
		mutex_enter(&zonehash_lock);
		real_nzones = zonecount;
		domi_nzones = 0;
		if (real_nzones > 0) {
			zoneids = kmem_alloc(real_nzones * sizeof (zoneid_t),
			    KM_SLEEP);
			for (zone = list_head(&zone_active); zone != NULL;
			    zone = list_next(&zone_active, zone))
				zoneids[domi_nzones++] = zone->zone_id;
			ASSERT(domi_nzones == real_nzones);
		}
		mutex_exit(&zonehash_lock);
	}

	/*
	 * If user has allocated space for fewer entries than we found, then
	 * return only up to his limit. Either way, tell him exactly how many
	 * we found.
	 */
	if (domi_nzones < user_nzones)
		user_nzones = domi_nzones;
	error = 0;
	if (copyout(&domi_nzones, numzones, sizeof (uint_t)) != 0) {
		error = EFAULT;
	} else if (zoneidlist != NULL && user_nzones != 0) {
		if (copyout(zoneids, zoneidlist,
		    user_nzones * sizeof (zoneid_t)) != 0)
			error = EFAULT;
	}

	if (real_nzones > 0)
		kmem_free(zoneids, real_nzones * sizeof (zoneid_t));

	if (error != 0)
		return (set_errno(error));
	else
		return (0);
}

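/*
 * Illustrative sketch, not part of this file: the in/out count
 * convention used by zone_list(), modelled in user space. The caller
 * passes its buffer capacity in *count; the function copies at most that
 * many IDs and always writes back the number actually found. The name
 * model_list() and the use of plain int IDs are assumptions for the
 * example, and memcpy() stands in for copyout().
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most *count IDs from found[] into out[], then store the
 * number of IDs found in *count. out may be NULL to query the count
 * without copying anything.
 */
void
model_list(const int *found, unsigned int nfound, int *out,
    unsigned int *count)
{
	unsigned int ncopy;

	assert(count != NULL);

	/* Never copy more than the caller said it has room for. */
	ncopy = (nfound < *count) ? nfound : *count;

	/* Either way, report exactly how many were found. */
	*count = nfound;

	if (out != NULL && ncopy != 0)
		memcpy(out, found, ncopy * sizeof (int));
}
```

/*
 * A caller of this model can pass a NULL buffer first to learn the
 * count, allocate, and call again; if the second call reports a larger
 * count than the buffer holds, the list grew in between.
 */
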
/*
 * Systemcall entry point for zone_lookup(2).
 *
 * Non-global zones are only able to see themselves and (on labeled systems)
 * the zones they dominate.
 */
static zoneid_t
zone_lookup(const char *zone_name)
{
	char *kname;
	zone_t *zone;
	zoneid_t zoneid;
	int err;

	if (zone_name == NULL) {
		/* return caller's zone id */
		return (getzoneid());
	}

	kname = kmem_zalloc(ZONENAME_MAX, KM_SLEEP);
	if ((err = copyinstr(zone_name, kname, ZONENAME_MAX, NULL)) != 0) {
		kmem_free(kname, ZONENAME_MAX);
		return (set_errno(err));
	}

	mutex_enter(&zonehash_lock);
	zone = zone_find_all_by_name(kname);
	kmem_free(kname, ZONENAME_MAX);
	/*
	 * In a non-global zone, can only lookup global and own name.
	 * In Trusted Extensions zone label dominance rules apply.
	 */
	if (zone == NULL ||
	    zone_status_get(zone) < ZONE_IS_READY ||
	    !zone_list_access(zone)) {
		mutex_exit(&zonehash_lock);
		return (set_errno(EINVAL));
	} else {
		zoneid = zone->zone_id;
		mutex_exit(&zonehash_lock);
		return (zoneid);
	}
}

static int
zone_version(int *version_arg)
{
	int version = ZONE_SYSCALL_API_VERSION;

	if (copyout(&version, version_arg, sizeof (int)) != 0)
		return (set_errno(EFAULT));
	return (0);
}

/* ARGSUSED */
long
zone(int cmd, void *arg1, void *arg2, void *arg3, void *arg4)
{
	zone_def zs;
	int err;

	switch (cmd) {
	case ZONE_CREATE:
		if (get_udatamodel() == DATAMODEL_NATIVE) {
			if (copyin(arg1, &zs, sizeof (zone_def))) {
				return (set_errno(EFAULT));
			}
		} else {
#ifdef _SYSCALL32_IMPL
			zone_def32 zs32;

			if (copyin(arg1, &zs32, sizeof (zone_def32))) {
				return (set_errno(EFAULT));
			}
			zs.zone_name =
			    (const char *)(unsigned long)zs32.zone_name;
			zs.zone_root =
			    (const char *)(unsigned long)zs32.zone_root;
			zs.zone_privs =
			    (const struct priv_set *)
			    (unsigned long)zs32.zone_privs;
			zs.zone_privssz = zs32.zone_privssz;
			zs.rctlbuf = (caddr_t)(unsigned long)zs32.rctlbuf;
			zs.rctlbufsz = zs32.rctlbufsz;
			zs.zfsbuf = (caddr_t)(unsigned long)zs32.zfsbuf;
			zs.zfsbufsz = zs32.zfsbufsz;
			zs.extended_error =
			    (int *)(unsigned long)zs32.extended_error;
			zs.match = zs32.match;
			zs.doi = zs32.doi;
			zs.label = (const bslabel_t *)(uintptr_t)zs32.label;
			zs.flags = zs32.flags;
#else
			panic("get_udatamodel() returned bogus result\n");
#endif
		}

		return (zone_create(zs.zone_name, zs.zone_root,
		    zs.zone_privs, zs.zone_privssz,
		    (caddr_t)zs.rctlbuf, zs.rctlbufsz,
		    (caddr_t)zs.zfsbuf, zs.zfsbufsz,
		    zs.extended_error, zs.match, zs.doi,
		    zs.label, zs.flags));
	case ZONE_BOOT:
		return (zone_boot((zoneid_t)(uintptr_t)arg1));
	case ZONE_DESTROY:
		return (zone_destroy((zoneid_t)(uintptr_t)arg1));
	case ZONE_GETATTR:
		return (zone_getattr((zoneid_t)(uintptr_t)arg1,
		    (int)(uintptr_t)arg2, arg3, (size_t)arg4));
	case ZONE_SETATTR:
		return (zone_setattr((zoneid_t)(uintptr_t)arg1,
		    (int)(uintptr_t)arg2, arg3, (size_t)arg4));
	case ZONE_ENTER:
		return (zone_enter((zoneid_t)(uintptr_t)arg1));
	case ZONE_LIST:
		return (zone_list((zoneid_t *)arg1, (uint_t *)arg2));
	case ZONE_SHUTDOWN:
		return (zone_shutdown((zoneid_t)(uintptr_t)arg1));
	case ZONE_LOOKUP:
		return (zone_lookup((const char *)arg1));
	case ZONE_VERSION:
		return (zone_version((int *)arg1));
3448 | 5589 |
case ZONE_ADD_DATALINK: |
5590 |
return (zone_add_datalink((zoneid_t)(uintptr_t)arg1, |
|
10616
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5591 |
(datalink_id_t)(uintptr_t)arg2)); |
3448 | 5592 |
case ZONE_DEL_DATALINK: |
5593 |
return (zone_remove_datalink((zoneid_t)(uintptr_t)arg1, |
|
10616
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5594 |
(datalink_id_t)(uintptr_t)arg2)); |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5595 |
case ZONE_CHECK_DATALINK: { |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5596 |
zoneid_t zoneid; |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5597 |
boolean_t need_copyout; |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5598 |
|
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5599 |
if (copyin(arg1, &zoneid, sizeof (zoneid)) != 0) |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5600 |
return (EFAULT); |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5601 |
need_copyout = (zoneid == ALL_ZONES); |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5602 |
err = zone_check_datalink(&zoneid, |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5603 |
(datalink_id_t)(uintptr_t)arg2); |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5604 |
if (err == 0 && need_copyout) { |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5605 |
if (copyout(&zoneid, arg1, sizeof (zoneid)) != 0) |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5606 |
err = EFAULT; |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5607 |
} |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5608 |
return (err == 0 ? 0 : set_errno(err)); |
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5609 |
} |
3448 | 5610 |
case ZONE_LIST_DATALINK: |
5611 |
return (zone_list_datalink((zoneid_t)(uintptr_t)arg1, |
|
10616
3be00c4a6835
PSARC 2009/373 Clearview IP Tunneling
Sebastien Roy <Sebastien.Roy@Sun.COM>
parents:
9160
diff
changeset
|
5612 |
(int *)arg2, (datalink_id_t *)(uintptr_t)arg3)); |
0 | 5613 |
default: |
5614 |
return (set_errno(EINVAL)); |
|
5615 |
} |
|
5616 |
} |
|
struct zarg {
	zone_t *zone;
	zone_cmd_arg_t arg;
};

static int
zone_lookup_door(const char *zone_name, door_handle_t *doorp)
{
	char *buf;
	size_t buflen;
	int error;

	buflen = sizeof (ZONE_DOOR_PATH) + strlen(zone_name);
	buf = kmem_alloc(buflen, KM_SLEEP);
	(void) snprintf(buf, buflen, ZONE_DOOR_PATH, zone_name);
	error = door_ki_open(buf, doorp);
	kmem_free(buf, buflen);
	return (error);
}
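The buffer sizing used by zone_lookup_door() can be illustrated in user space. What follows is a minimal sketch under stated assumptions, not the kernel code: `EXAMPLE_DOOR_PATH` and `example_door_path()` are hypothetical names, the format string is an assumed stand-in for ZONE_DOOR_PATH (whose definition is not shown in this file), and malloc() stands in for kmem_alloc(). It shows why `sizeof (format) + strlen(name)` is always large enough: sizeof already counts the terminating NUL and the two bytes of "%s" that the name replaces.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ZONE_DOOR_PATH; the real value is not shown. */
#define	EXAMPLE_DOOR_PATH	"/var/run/zones/%s.example_door"

/*
 * Build the door path for a zone name.  Returns a malloc()ed string that
 * the caller must free(), or NULL if allocation fails.
 */
char *
example_door_path(const char *zone_name)
{
	size_t buflen;
	char *buf;

	assert(zone_name != NULL);

	/*
	 * sizeof counts the NUL terminator and the two bytes of "%s", so
	 * this over-allocates by two bytes and can never truncate.
	 */
	buflen = sizeof (EXAMPLE_DOOR_PATH) + strlen(zone_name);
	buf = malloc(buflen);
	if (buf == NULL)
		return (NULL);
	(void) snprintf(buf, buflen, EXAMPLE_DOOR_PATH, zone_name);
	return (buf);
}
```

A caller would pass the zone name, use the returned path, and free it; the kernel version instead frees the buffer with the same length it allocated, which is why buflen is kept in a variable.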
|
static void
zone_release_door(door_handle_t *doorp)
{
	door_ki_rele(*doorp);
	*doorp = NULL;
}

static void
zone_ki_call_zoneadmd(struct zarg *zargp)
{
	door_handle_t door = NULL;
	door_arg_t darg, save_arg;
	char *zone_name;
	size_t zone_namelen;
	zoneid_t zoneid;
	zone_t *zone;
	zone_cmd_arg_t arg;
	uint64_t uniqid;
	size_t size;
	int error;
	int retry;

	zone = zargp->zone;
	arg = zargp->arg;
	kmem_free(zargp, sizeof (*zargp));

	zone_namelen = strlen(zone->zone_name) + 1;
	zone_name = kmem_alloc(zone_namelen, KM_SLEEP);
	bcopy(zone->zone_name, zone_name, zone_namelen);
	zoneid = zone->zone_id;
	uniqid = zone->zone_uniqid;
	/*
	 * zoneadmd may be down, but at least we can empty out the zone.
	 * We can ignore the return value of zone_empty() since we're called
	 * from a kernel thread and know we won't be delivered any signals.
	 */
	ASSERT(curproc == &p0);
	(void) zone_empty(zone);
	ASSERT(zone_status_get(zone) >= ZONE_IS_EMPTY);
	zone_rele(zone);

	size = sizeof (arg);
	darg.rbuf = (char *)&arg;
	darg.data_ptr = (char *)&arg;
	darg.rsize = size;
	darg.data_size = size;
	darg.desc_ptr = NULL;
	darg.desc_num = 0;

	save_arg = darg;
	/*
	 * Since we're not holding a reference to the zone, any number of
	 * things can go wrong, including the zone disappearing before we get a
	 * chance to talk to zoneadmd.
	 */
	for (retry = 0; /* forever */; retry++) {
		if (door == NULL &&
		    (error = zone_lookup_door(zone_name, &door)) != 0) {
			goto next;
		}
		ASSERT(door != NULL);

		if ((error = door_ki_upcall_limited(door, &darg, NULL,
		    SIZE_MAX, 0)) == 0) {
			break;
		}
		switch (error) {
		case EINTR:
			/* FALLTHROUGH */
		case EAGAIN:	/* process may be forking */
			/*
			 * Back off for a bit
			 */
			break;
		case EBADF:
			zone_release_door(&door);
			if (zone_lookup_door(zone_name, &door) != 0) {
				/*
				 * zoneadmd may be dead, but it may come back to
				 * life later.
				 */
				break;
			}
			break;
		default:
			cmn_err(CE_WARN,
			    "zone_ki_call_zoneadmd: door_ki_upcall error %d\n",
			    error);
			goto out;
		}
next:
		/*
		 * If this isn't the same zone_t that we originally had in mind,
		 * then this is the same as if two kadmin requests come in at
		 * the same time: the first one wins. This means we lose, so we
		 * bail.
		 */
		if ((zone = zone_find_by_id(zoneid)) == NULL) {
			/*
			 * Problem is solved.
			 */
			break;
		}
		if (zone->zone_uniqid != uniqid) {
			/*
			 * zoneid recycled
			 */
			zone_rele(zone);
			break;
		}
		/*
		 * We could zone_status_timedwait(), but there doesn't seem to
		 * be much point in doing that (plus, it would mean that
		 * zone_free() isn't called until this thread exits).
		 */
		zone_rele(zone);
		delay(hz);
		darg = save_arg;
	}
out:
	if (door != NULL) {
		zone_release_door(&door);
	}
	kmem_free(zone_name, zone_namelen);
	thread_exit();
}
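The retry loop above decides whether to keep going by looking the zone up again by ID and comparing its uniqid with the one saved before the reference was dropped. The sketch below illustrates that generation-number check in isolation. It is an assumption-laden user-space model, not the kernel's data structures: `example_zone_t`, `example_verdict_t` and `example_check_generation()` are hypothetical names, and a flat array replaces the zone hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a zone table entry. */
typedef struct example_zone {
	int id;			/* reusable identifier (like a zoneid) */
	uint64_t uniqid;	/* generation number, never reused */
	int in_use;		/* nonzero while the slot is live */
} example_zone_t;

typedef enum {
	EXAMPLE_GONE,		/* no live entry has this id */
	EXAMPLE_RECYCLED,	/* id is live but is a newer instance */
	EXAMPLE_SAME		/* still the instance we remembered */
} example_verdict_t;

/*
 * Decide whether (id, uniqid), saved earlier without holding a reference,
 * still names the same instance in the table.
 */
example_verdict_t
example_check_generation(const example_zone_t *table, size_t n, int id,
    uint64_t uniqid)
{
	size_t i;

	assert(table != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (table[i].in_use && table[i].id == id) {
			return (table[i].uniqid == uniqid ?
			    EXAMPLE_SAME : EXAMPLE_RECYCLED);
		}
	}
	return (EXAMPLE_GONE);
}
```

In terms of the loop above, EXAMPLE_GONE and EXAMPLE_RECYCLED both correspond to breaking out, while EXAMPLE_SAME corresponds to releasing the hold, delaying, and retrying the upcall.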
|
/*
 * Entry point for uadmin() to tell the zone to go away or reboot.
 * Analogous to kadmin(). The caller is a process in the zone.
 *
 * In order to shut down the zone, we will hand off control to zoneadmd
 * (running in the global zone) via a door. We do a half-hearted job at
 * killing all processes in the zone, create a kernel thread to contact
 * zoneadmd, and make note of the "uniqid" of the zone. The uniqid is
 * a form of generation number used to let zoneadmd (as well as
 * zone_destroy()) know exactly which zone they're talking about.
 */
int
zone_kadmin(int cmd, int fcn, const char *mdep, cred_t *credp)
{
	struct zarg *zargp;
	zone_cmd_t zcmd;
	zone_t *zone;

	zone = curproc->p_zone;
	ASSERT(getzoneid() != GLOBAL_ZONEID);

	switch (cmd) {
	case A_SHUTDOWN:
		switch (fcn) {
		case AD_HALT:
		case AD_POWEROFF:
			zcmd = Z_HALT;
			break;
		case AD_BOOT:
			zcmd = Z_REBOOT;
			break;
		case AD_IBOOT:
		case AD_SBOOT:
		case AD_SIBOOT:
		case AD_NOSYNC:
			return (ENOTSUP);
		default:
			return (EINVAL);
		}
		break;
	case A_REBOOT:
		zcmd = Z_REBOOT;
		break;
	case A_FTRACE:
	case A_REMOUNT:
	case A_FREEZE:
	case A_DUMP:
	case A_CONFIG:
		return (ENOTSUP);
	default:
		ASSERT(cmd != A_SWAPCTL);	/* handled by uadmin() */
		return (EINVAL);
	}

	if (secpolicy_zone_admin(credp, B_FALSE))
		return (EPERM);
	mutex_enter(&zone_status_lock);

	/*
	 * zone_status can't be ZONE_IS_EMPTY or higher since curproc
	 * is in the zone.
	 */
	ASSERT(zone_status_get(zone) < ZONE_IS_EMPTY);
	if (zone_status_get(zone) > ZONE_IS_RUNNING) {
		/*
		 * This zone is already on its way down.
		 */
		mutex_exit(&zone_status_lock);
		return (0);
	}
	/*
	 * Prevent future zone_enter()s
	 */
	zone_status_set(zone, ZONE_IS_SHUTTING_DOWN);
	mutex_exit(&zone_status_lock);

	/*
	 * Kill everyone now and call zoneadmd later.
	 * zone_ki_call_zoneadmd() will do a more thorough job of this
	 * later.
	 */
	killall(zone->zone_id);
	/*
	 * Now, create the thread to contact zoneadmd and do the rest of the
	 * work. This thread can't be created in our zone otherwise
	 * zone_destroy() would deadlock.
	 */
	zargp = kmem_zalloc(sizeof (*zargp), KM_SLEEP);
	zargp->arg.cmd = zcmd;
	zargp->arg.uniqid = zone->zone_uniqid;
	zargp->zone = zone;
	(void) strcpy(zargp->arg.locale, "C");
	/* mdep was already copied in for us by uadmin */
	if (mdep != NULL)
		(void) strlcpy(zargp->arg.bootbuf, mdep,
		    sizeof (zargp->arg.bootbuf));
	zone_hold(zone);

	(void) thread_create(NULL, 0, zone_ki_call_zoneadmd, zargp, 0, &p0,
	    TS_RUN, minclsyspri);
	exit(CLD_EXITED, 0);

	return (EINVAL);
}
|
/*
 * Entry point so kadmin(A_SHUTDOWN, ...) can set the global zone's
 * status to ZONE_IS_SHUTTING_DOWN.
 *
 * This function also shuts down all running zones to ensure that they won't
 * fork new processes.
 */
void
zone_shutdown_global(void)
{
	zone_t *current_zonep;

	ASSERT(INGLOBALZONE(curproc));
	mutex_enter(&zonehash_lock);
	mutex_enter(&zone_status_lock);

	/* Modify the global zone's status first. */
	ASSERT(zone_status_get(global_zone) == ZONE_IS_RUNNING);
	zone_status_set(global_zone, ZONE_IS_SHUTTING_DOWN);

	/*
	 * Now change the states of all running zones to ZONE_IS_SHUTTING_DOWN.
	 * We don't mark all zones with ZONE_IS_SHUTTING_DOWN because doing so
	 * could cause assertions to fail (e.g., assertions about a zone's
	 * state during initialization, readying, or booting) or produce races.
	 * We'll let threads continue to initialize and ready new zones: they'll
	 * fail to boot the new zones when they see that the global zone is
	 * shutting down.
	 */
	for (current_zonep = list_head(&zone_active); current_zonep != NULL;
	    current_zonep = list_next(&zone_active, current_zonep)) {
		if (zone_status_get(current_zonep) == ZONE_IS_RUNNING)
			zone_status_set(current_zonep, ZONE_IS_SHUTTING_DOWN);
	}
	mutex_exit(&zone_status_lock);
	mutex_exit(&zonehash_lock);
}
|
/*
 * Returns true if the named dataset is visible in the current zone.
 * The 'write' parameter is set to 1 if the dataset is also writable.
 */
int
zone_dataset_visible(const char *dataset, int *write)
{
	static int zfstype = -1;
	zone_dataset_t *zd;
	size_t len;
	zone_t *zone = curproc->p_zone;
	const char *name = NULL;
	vfs_t *vfsp = NULL;

	if (dataset[0] == '\0')
		return (0);

	/*
	 * Walk the list once, looking for datasets which match exactly, or
	 * specify a dataset underneath an exported dataset. If found, return
	 * true and note that it is writable.
	 */
	for (zd = list_head(&zone->zone_datasets); zd != NULL;
	    zd = list_next(&zone->zone_datasets, zd)) {

		len = strlen(zd->zd_dataset);
		if (strlen(dataset) >= len &&
		    bcmp(dataset, zd->zd_dataset, len) == 0 &&
		    (dataset[len] == '\0' || dataset[len] == '/' ||
		    dataset[len] == '@')) {
			if (write)
				*write = 1;
			return (1);
		}
	}

	/*
	 * Walk the list a second time, searching for datasets which are parents
	 * of exported datasets. These should be visible, but read-only.
	 *
	 * Note that we also have to support forms such as 'pool/dataset/', with
	 * a trailing slash.
	 */
	for (zd = list_head(&zone->zone_datasets); zd != NULL;
	    zd = list_next(&zone->zone_datasets, zd)) {

		len = strlen(dataset);
		if (dataset[len - 1] == '/')
			len--;	/* Ignore trailing slash */
		if (len < strlen(zd->zd_dataset) &&
		    bcmp(dataset, zd->zd_dataset, len) == 0 &&
		    zd->zd_dataset[len] == '/') {
			if (write)
				*write = 0;
			return (1);
		}
	}

	/*
	 * We reach here if the given dataset is not found in the zone_dataset
	 * list. Check if this dataset was added as a filesystem (i.e., "add
	 * fs") instead of by delegation. For this we search for the dataset
	 * in the zone_vfslist of this zone. If found, return true and note
	 * that it is not writable.
	 */

	/*
	 * Initialize zfstype if it is not initialized yet.
	 */
	if (zfstype == -1) {
		struct vfssw *vswp = vfs_getvfssw("zfs");
		zfstype = vswp - vfssw;
		vfs_unrefvfssw(vswp);
	}

	vfs_list_read_lock();
	vfsp = zone->zone_vfslist;
	do {
		ASSERT(vfsp);
		if (vfsp->vfs_fstype == zfstype) {
			name = refstr_value(vfsp->vfs_resource);

			/*
			 * Check if we have an exact match.
			 */
			if (strcmp(dataset, name) == 0) {
				vfs_list_unlock();
				if (write)
					*write = 0;
				return (1);
			}
			/*
			 * We need to check if we are looking for parents of
			 * a dataset. These should be visible, but read-only.
			 */
			len = strlen(dataset);
			if (dataset[len - 1] == '/')
				len--;

			if (len < strlen(name) &&
			    bcmp(dataset, name, len) == 0 && name[len] == '/') {
				vfs_list_unlock();
				if (write)
					*write = 0;
				return (1);
			}
		}
		vfsp = vfsp->vfs_zone_next;
	} while (vfsp != zone->zone_vfslist);

	vfs_list_unlock();
	return (0);
}
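The two list walks in zone_dataset_visible() apply two different prefix tests: "same as or underneath an exported dataset" (visible and writable) and "proper ancestor of an exported dataset" (visible, read-only). The following user-space sketch isolates those two string tests. It is an illustration under assumptions, not the kernel routine: `example_is_self_or_descendant()` and `example_is_ancestor()` are hypothetical names, memcmp() replaces bcmp(), and the zone dataset list is reduced to a single exported name passed in by the caller.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if 'dataset' is 'exported' itself, a child of it
 * ("exported/..."), or a snapshot of it ("exported@..."); 0 otherwise.
 */
int
example_is_self_or_descendant(const char *dataset, const char *exported)
{
	size_t len;

	assert(dataset != NULL && exported != NULL);
	len = strlen(exported);
	return (strlen(dataset) >= len &&
	    memcmp(dataset, exported, len) == 0 &&
	    (dataset[len] == '\0' || dataset[len] == '/' ||
	    dataset[len] == '@'));
}

/*
 * Returns 1 if 'dataset' is a proper ancestor of 'exported'; a single
 * trailing slash on 'dataset' is ignored.  Returns 0 otherwise.
 */
int
example_is_ancestor(const char *dataset, const char *exported)
{
	size_t len;

	assert(dataset != NULL && exported != NULL);
	len = strlen(dataset);
	if (len > 0 && dataset[len - 1] == '/')
		len--;	/* ignore trailing slash */
	return (len < strlen(exported) &&
	    memcmp(dataset, exported, len) == 0 &&
	    exported[len] == '/');
}
```

Checking the character that follows the common prefix is what keeps "tank/zoneX" from matching an exported "tank/zone", and what keeps "tan" from being treated as a parent of "tank/zone".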
|
1676 | 6021 |
|
6022 |
/* |
|
6023 |
* zone_find_by_any_path() - |
|
6024 |
* |
|
6025 |
* kernel-private routine similar to zone_find_by_path(), but which |
|
6026 |
* effectively compares against zone paths rather than zonerootpath |
|
6027 |
* (i.e., the last component of zonerootpaths, which should be "root/", |
|
6028 |
* are not compared.) This is done in order to accurately identify all |
|
6029 |
* paths, whether zone-visible or not, including those which are parallel |
|
6030 |
* to /root/, such as /dev/, /home/, etc... |
|
6031 |
* |
|
6032 |
* If the specified path does not fall under any zone path then global |
|
6033 |
* zone is returned. |
|
6034 |
* |
|
6035 |
* The treat_abs parameter indicates whether the path should be treated as |
|
6036 |
* an absolute path although it does not begin with "/". (This supports |
|
6037 |
* nfs mount syntax such as host:any/path.) |
|
6038 |
* |
|
6039 |
* The caller is responsible for zone_rele of the returned zone. |
|
6040 |
*/ |
zone_t *
zone_find_by_any_path(const char *path, boolean_t treat_abs)
{
	zone_t *zone;
	int path_offset = 0;

	if (path == NULL) {
		zone_hold(global_zone);
		return (global_zone);
	}

	if (*path != '/') {
		ASSERT(treat_abs);
		path_offset = 1;
	}

	mutex_enter(&zonehash_lock);
	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		char *c;
		size_t pathlen;
		char *rootpath_start;

		if (zone == global_zone)	/* skip global zone */
			continue;

		/* scan backwards to find start of last component */
		c = zone->zone_rootpath + zone->zone_rootpathlen - 2;
		do {
			c--;
		} while (*c != '/');

		pathlen = c - zone->zone_rootpath + 1 - path_offset;
		rootpath_start = (zone->zone_rootpath + path_offset);
		if (strncmp(path, rootpath_start, pathlen) == 0)
			break;
	}
	if (zone == NULL)
		zone = global_zone;
	zone_hold(zone);
	mutex_exit(&zonehash_lock);
	return (zone);
}
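
/*
 * Illustrative sketch (not part of the original file): a userland
 * approximation of the prefix comparison that zone_find_by_any_path()
 * performs.  The names zonepath_len() and path_in_zone() are hypothetical.
 * The sketch assumes every root path ends in "/" and contains at least two
 * "/" characters, as in "/zones/a/root/"; it does no locking and takes no
 * zone holds.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Length of rootpath with its last component removed, keeping the slash
 * that precedes that component ("/zones/a/root/" gives "/zones/a/").
 */
size_t
zonepath_len(const char *rootpath)
{
	/* start on the trailing '/' and scan backwards to the previous one */
	const char *c = rootpath + strlen(rootpath) - 1;

	do {
		c--;
	} while (*c != '/');
	return ((size_t)(c - rootpath) + 1);
}

/*
 * Return 1 if path falls under the zone path derived from rootpath.
 * A path that does not begin with '/' is treated as absolute by skipping
 * the leading '/' of rootpath, mirroring the treat_abs case.
 */
int
path_in_zone(const char *path, const char *rootpath)
{
	size_t offset = (*path != '/') ? 1 : 0;
	size_t len = zonepath_len(rootpath) - offset;

	return (strncmp(path, rootpath + offset, len) == 0);
}
```

/*
 * With these helpers, "/zones/a/dev/null" matches the root path
 * "/zones/a/root/" because only the "/zones/a/" prefix is compared, while
 * "/zones/b/root/etc" does not.
 */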

/*
 * Finds a zone_dl_t with the given linkid in the given zone.  Returns the
 * zone_dl_t pointer if found, and NULL otherwise.
 */
static zone_dl_t *
zone_find_dl(zone_t *zone, datalink_id_t linkid)
{
	zone_dl_t *zdl;

	ASSERT(mutex_owned(&zone->zone_lock));
	for (zdl = list_head(&zone->zone_dl_list); zdl != NULL;
	    zdl = list_next(&zone->zone_dl_list, zdl)) {
		if (zdl->zdl_id == linkid)
			break;
	}
	return (zdl);
}

static boolean_t
zone_dl_exists(zone_t *zone, datalink_id_t linkid)
{
	boolean_t exists;

	mutex_enter(&zone->zone_lock);
	exists = (zone_find_dl(zone, linkid) != NULL);
	mutex_exit(&zone->zone_lock);
	return (exists);
}

/*
 * Add a datalink ID to the zone.
 */
static int
zone_add_datalink(zoneid_t zoneid, datalink_id_t linkid)
{
	zone_dl_t *zdl;
	zone_t *zone;
	zone_t *thiszone;

	if ((thiszone = zone_find_by_id(zoneid)) == NULL)
		return (set_errno(ENXIO));

	/* Verify that the datalink ID doesn't already belong to a zone. */
	mutex_enter(&zonehash_lock);
	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		if (zone_dl_exists(zone, linkid)) {
			mutex_exit(&zonehash_lock);
			zone_rele(thiszone);
			return (set_errno((zone == thiszone) ? EEXIST : EPERM));
		}
	}

	zdl = kmem_zalloc(sizeof (*zdl), KM_SLEEP);
	zdl->zdl_id = linkid;
	mutex_enter(&thiszone->zone_lock);
	list_insert_head(&thiszone->zone_dl_list, zdl);
	mutex_exit(&thiszone->zone_lock);
	mutex_exit(&zonehash_lock);
	zone_rele(thiszone);
	return (0);
}

static int
zone_remove_datalink(zoneid_t zoneid, datalink_id_t linkid)
{
	zone_dl_t *zdl;
	zone_t *zone;
	int err = 0;

	if ((zone = zone_find_by_id(zoneid)) == NULL)
		return (set_errno(EINVAL));

	mutex_enter(&zone->zone_lock);
	if ((zdl = zone_find_dl(zone, linkid)) == NULL) {
		err = ENXIO;
	} else {
		list_remove(&zone->zone_dl_list, zdl);
		kmem_free(zdl, sizeof (zone_dl_t));
	}
	mutex_exit(&zone->zone_lock);
	zone_rele(zone);
	return (err == 0 ? 0 : set_errno(err));
}

/*
 * If *zoneidp is ALL_ZONES, look up which zone has been assigned the linkid
 * and return that zone's ID in *zoneidp.  Otherwise, just check whether the
 * zone identified by *zoneidp has been assigned the supplied linkid.
 * Returns 0 if a match is found and ENXIO otherwise.
 */
int
zone_check_datalink(zoneid_t *zoneidp, datalink_id_t linkid)
{
	zone_t *zone;
	int err = ENXIO;

	if (*zoneidp != ALL_ZONES) {
		if ((zone = zone_find_by_id(*zoneidp)) != NULL) {
			if (zone_dl_exists(zone, linkid))
				err = 0;
			zone_rele(zone);
		}
		return (err);
	}

	mutex_enter(&zonehash_lock);
	for (zone = list_head(&zone_active); zone != NULL;
	    zone = list_next(&zone_active, zone)) {
		if (zone_dl_exists(zone, linkid)) {
			*zoneidp = zone->zone_id;
			err = 0;
			break;
		}
	}
	mutex_exit(&zonehash_lock);
	return (err);
}

/*
 * Get the list of datalink IDs assigned to a zone.
 *
 * On input, *nump is the number of datalink IDs that can fit in the supplied
 * idarray.  Upon return, *nump is either set to the number of datalink IDs
 * that were placed in the array if the array was large enough, or to the
 * number of datalink IDs that the function needs to place in the array if the
 * array is too small.
 */
static int
zone_list_datalink(zoneid_t zoneid, int *nump, datalink_id_t *idarray)
{
	uint_t num, dlcount;
	zone_t *zone;
	zone_dl_t *zdl;
	datalink_id_t *idptr = idarray;

	if (copyin(nump, &dlcount, sizeof (dlcount)) != 0)
		return (set_errno(EFAULT));
	if ((zone = zone_find_by_id(zoneid)) == NULL)
		return (set_errno(ENXIO));

	num = 0;
	mutex_enter(&zone->zone_lock);
	for (zdl = list_head(&zone->zone_dl_list); zdl != NULL;
	    zdl = list_next(&zone->zone_dl_list, zdl)) {
		/*
		 * If the list is bigger than what the caller supplied, just
		 * count, don't do copyout.
		 */
		if (++num > dlcount)
			continue;
		if (copyout(&zdl->zdl_id, idptr, sizeof (*idptr)) != 0) {
			mutex_exit(&zone->zone_lock);
			zone_rele(zone);
			return (set_errno(EFAULT));
		}
		idptr++;
	}
	mutex_exit(&zone->zone_lock);
	zone_rele(zone);

	/* Increased or decreased, caller should be notified. */
	if (num != dlcount) {
		if (copyout(&num, nump, sizeof (num)) != 0)
			return (set_errno(EFAULT));
	}
	return (0);
}
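
/*
 * Illustrative sketch (not part of the original file): the count-or-copy
 * convention used for *nump in zone_list_datalink(), reduced to plain
 * userland C without copyin()/copyout() or locking.  list_ids() and its
 * parameters are hypothetical names chosen for this example.  Unlike the
 * kernel routine, the sketch always writes *nump, which gives the same
 * value the caller would observe.
 */

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy at most *nump IDs from src into dst.  On return, *nump holds the
 * total number of IDs in src, whether or not they all fit in dst.
 */
void
list_ids(const unsigned *src, unsigned nsrc, unsigned *dst, unsigned *nump)
{
	unsigned cap = *nump;	/* capacity supplied by the caller */
	unsigned num = 0;
	unsigned i;

	for (i = 0; i < nsrc; i++) {
		/* once the array is full, keep counting but stop copying */
		if (++num > cap)
			continue;
		dst[num - 1] = src[i];
	}
	*nump = num;
}
```

/*
 * A caller can compare the returned count with the capacity it passed in:
 * a larger count means the array was too small and the call should be
 * repeated with room for at least that many IDs.
 */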

/*
 * Public interface for looking up a zone by zoneid.  It's a customized
 * version for netstack_zone_create().  It can only be called from the zsd
 * create callbacks, since it doesn't take a reference on the zone structure;
 * if it is called elsewhere, the zone could disappear after the
 * zonehash_lock is dropped.
 *
 * Furthermore it
 * 1. Doesn't check the status of the zone.
 * 2. Will be called even before zone_init is called.  In that case the
 *    address of zone0 is returned directly, and netstack_zone_create()
 *    will only assign a value to zone0.zone_netstack, which won't break
 *    anything.
 * 3. Returns without the zone being held.
 */
zone_t *
zone_find_by_id_nolock(zoneid_t zoneid)
{
	zone_t *zone;

	mutex_enter(&zonehash_lock);
	if (zonehashbyid == NULL)
		zone = &zone0;
	else
		zone = zone_find_all_by_id(zoneid);
	mutex_exit(&zonehash_lock);
	return (zone);
}

/*
 * Walk the datalinks for a given zone
 */
int
zone_datalink_walk(zoneid_t zoneid, int (*cb)(datalink_id_t, void *),
    void *data)
{
	zone_t *zone;
	zone_dl_t *zdl;
	datalink_id_t *idarray;
	uint_t idcount = 0;
	int i, ret = 0;

	if ((zone = zone_find_by_id(zoneid)) == NULL)
		return (ENOENT);

	/*
	 * We first build an array of linkid's so that we can walk these and
	 * execute the callback with the zone_lock dropped.
	 */
	mutex_enter(&zone->zone_lock);
	for (zdl = list_head(&zone->zone_dl_list); zdl != NULL;
	    zdl = list_next(&zone->zone_dl_list, zdl)) {
		idcount++;
	}

	if (idcount == 0) {
		mutex_exit(&zone->zone_lock);
		zone_rele(zone);
		return (0);
	}

	idarray = kmem_alloc(sizeof (datalink_id_t) * idcount, KM_NOSLEEP);
	if (idarray == NULL) {
		mutex_exit(&zone->zone_lock);
		zone_rele(zone);
		return (ENOMEM);
	}

	for (i = 0, zdl = list_head(&zone->zone_dl_list); zdl != NULL;
	    i++, zdl = list_next(&zone->zone_dl_list, zdl)) {
		idarray[i] = zdl->zdl_id;
	}

	mutex_exit(&zone->zone_lock);

	for (i = 0; i < idcount && ret == 0; i++) {
		if ((ret = (*cb)(idarray[i], data)) != 0)
			break;
	}

	zone_rele(zone);
	kmem_free(idarray, sizeof (datalink_id_t) * idcount);
	return (ret);
}
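
/*
 * Illustrative sketch (not part of the original file): the pattern used by
 * zone_datalink_walk(), which snapshots the IDs while holding the lock and
 * runs the callback only after the lock has been dropped.  This is a
 * userland approximation that assumes POSIX threads and malloc() in place
 * of kernel mutexes and kmem_alloc(); struct idlist, struct acc,
 * idlist_walk() and acc_cb() are hypothetical names.
 */

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct node {
	unsigned id;
	struct node *next;
};

struct idlist {
	pthread_mutex_t lock;	/* protects the linked list */
	struct node *head;
};

struct acc {
	unsigned sum;		/* running total of IDs seen */
	unsigned limit;		/* stop once sum reaches this value */
};

/*
 * Call cb for each ID in the list.  The IDs are copied into a private
 * array under the lock, so cb runs without the lock held.  The walk stops
 * at the first non-zero return from cb and returns that value; -1 is
 * returned if the snapshot cannot be allocated.
 */
int
idlist_walk(struct idlist *l, int (*cb)(unsigned, void *), void *data)
{
	struct node *n;
	unsigned *ids;
	size_t count = 0, i;
	int ret = 0;

	pthread_mutex_lock(&l->lock);
	for (n = l->head; n != NULL; n = n->next)
		count++;
	if (count == 0) {
		pthread_mutex_unlock(&l->lock);
		return (0);
	}
	ids = malloc(count * sizeof (*ids));
	if (ids == NULL) {
		pthread_mutex_unlock(&l->lock);
		return (-1);
	}
	for (i = 0, n = l->head; n != NULL; i++, n = n->next)
		ids[i] = n->id;
	pthread_mutex_unlock(&l->lock);

	/* the callback may block or take other locks safely here */
	for (i = 0; i < count && ret == 0; i++)
		ret = cb(ids[i], data);

	free(ids);
	return (ret);
}

/* Example callback: add id to the total, ask to stop at the limit. */
int
acc_cb(unsigned id, void *data)
{
	struct acc *a = data;

	a->sum += id;
	return (a->sum >= a->limit);
}
```

/*
 * The cost of this design is that the callback sees a snapshot: an ID
 * removed from the list after the lock is dropped is still passed to cb.
 */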