components/openstack/swift/patches/CVE-2016-0738.patch
author david.comay@oracle.com
Thu, 11 Feb 2016 16:42:18 -0800
changeset 5445 9b7b887c6955
permissions -rw-r--r--
22575858 problem in SERVICE/SWIFT

This upstream patch addresses CVE-2016-0738 which is filed under
Launchpad bug 1493303. The issue is addressed in Swift 2.3.1 as well as
2.5.1 or later.

From a4c1825a026655b7ed21d779824ae7c25318fd52 Mon Sep 17 00:00:00 2001
From: Samuel Merritt <[email protected]>
Date: Tue, 8 Dec 2015 16:36:05 -0800
Subject: Fix memory/socket leak in proxy on truncated SLO/DLO GET

When a client disconnected while consuming an SLO or DLO GET response,
the proxy would leak a socket. This could be observed via strace as a
socket that had shutdown() called on it, but was never closed. It
could also be observed by counting entries in /proc/<pid>/fd, where
<pid> is the pid of a proxy server worker process.

This is due to a memory leak in SegmentedIterable. A SegmentedIterable
has an 'app_iter' attribute, which is a generator. That generator
references 'self' (the SegmentedIterable object). This creates a
cyclic reference: the generator refers to the SegmentedIterable, and
the SegmentedIterable refers to the generator.

Python can normally handle cyclic garbage; reference counting won't
reclaim it, but the garbage collector will. However, objects with
finalizers will stop the garbage collector from collecting them* and
the cycle of which they are part.

For most objects, "has finalizer" is synonymous with "has a __del__
method". However, a generator has a finalizer once it's started
running and before it finishes: basically, while it has stack frames
associated with it**.

When a client disconnects mid-stream, we get a memory leak. We have
our SegmentedIterable object (call it "si"), and its associated
generator. si.app_iter is the generator, and the generator closes over
si, so we have a cycle; and the generator has started but not yet
finished, so the generator needs finalization; hence, the garbage
collector won't ever clean it up.

The socket leak comes in because the generator *also* refers to the
request's WSGI environment, which contains wsgi.input, which
ultimately refers to a _socket object from the standard
library. Python's _socket objects only close their underlying file
descriptor when their reference counts fall to 0***.

This commit makes SegmentedIterable.close() call
self.app_iter.close(), thereby unwinding its generator's stack and
making it eligible for garbage collection.

* in Python < 3.4, at least. See PEP 442.

** see PyGen_NeedsFinalizing() in Objects/genobject.c and also
   has_finalizer() in Modules/gcmodule.c in Python.

*** see sock_dealloc() in Modules/socketmodule.c in Python. See
    sock_close() in the same file for the other half of the sad story.

This closes CVE-2016-0738.

Closes-Bug: 1493303

Change-Id: I9b617bfc152dca40d1750131d1d814d85c0a88dd
Co-Authored-By: Kota Tsuyuzaki <[email protected]>
---
 swift/common/request_helpers.py         |  6 ++--
 test/unit/common/middleware/test_slo.py | 62 +++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/swift/common/request_helpers.py b/swift/common/request_helpers.py
index 8aa8457..611ee83 100644
--- a/swift/common/request_helpers.py
+++ b/swift/common/request_helpers.py
@@ -378,6 +378,9 @@ class SegmentedIterable(object):
             self.logger.exception(_('ERROR: An error occurred '
                                     'while retrieving segments'))
             raise
+        finally:
+            if self.current_resp:
+                close_if_possible(self.current_resp.app_iter)
 
     def app_iter_range(self, *a, **kw):
         """
@@ -420,5 +423,4 @@ class SegmentedIterable(object):
         Called when the client disconnect. Ensure that the connection to the
         backend server is closed.
         """
-        if self.current_resp:
-            close_if_possible(self.current_resp.app_iter)
+        close_if_possible(self.app_iter)
diff --git a/test/unit/common/middleware/test_slo.py b/test/unit/common/middleware/test_slo.py
index 4d483c8..8119b5f 100644
--- a/test/unit/common/middleware/test_slo.py
+++ b/test/unit/common/middleware/test_slo.py
@@ -1253,6 +1253,68 @@ class TestSloGetManifest(SloTestCase):
         self.assertEqual(headers['X-Object-Meta-Fish'], 'Bass')
         self.assertEqual(body, '')
 
+    def test_generator_closure(self):
+        # Test that the SLO WSGI iterable closes its internal .app_iter when
+        # it receives a close() message.
+        #
+        # This is sufficient to fix a memory leak. The memory leak arises
+        # due to cyclic references involving a running generator; a running
+        # generator sometimes preventes the GC from collecting it in the
+        # same way that an object with a defined __del__ does.
+        #
+        # There are other ways to break the cycle and fix the memory leak as
+        # well; calling .close() on the generator is sufficient, but not
+        # necessary. However, having this test is better than nothing for
+        # preventing regressions.
+        leaks = [0]
+
+        class LeakTracker(object):
+            def __init__(self, inner_iter):
+                leaks[0] += 1
+                self.inner_iter = iter(inner_iter)
+
+            def __iter__(self):
+                return self
+
+            def next(self):
+                return next(self.inner_iter)
+
+            def close(self):
+                leaks[0] -= 1
+                self.inner_iter.close()
+
+        class LeakTrackingSegmentedIterable(slo.SegmentedIterable):
+            def _internal_iter(self, *a, **kw):
+                it = super(
+                    LeakTrackingSegmentedIterable, self)._internal_iter(
+                        *a, **kw)
+                return LeakTracker(it)
+
+        status = [None]
+        headers = [None]
+
+        def start_response(s, h, ei=None):
+            status[0] = s
+            headers[0] = h
+
+        req = Request.blank(
+            '/v1/AUTH_test/gettest/manifest-abcd',
+            environ={'REQUEST_METHOD': 'GET',
+                     'HTTP_ACCEPT': 'application/json'})
+
+        # can't self.call_slo() here since we don't want to consume the
+        # whole body
+        with patch.object(slo, 'SegmentedIterable',
+                          LeakTrackingSegmentedIterable):
+            app_resp = self.slo(req.environ, start_response)
+        self.assertEqual(status[0], '200 OK')  # sanity check
+        body_iter = iter(app_resp)
+        chunk = next(body_iter)
+        self.assertEqual(chunk, 'aaaaa')  # sanity check
+
+        app_resp.close()
+        self.assertEqual(0, leaks[0])
+
     def test_head_manifest_is_efficient(self):
         req = Request.blank(
             '/v1/AUTH_test/gettest/manifest-abcd',
-- 
cgit v0.11.2