[Pulp-dev] Memory consumption on RPM sync

Daniel Alley dalley at redhat.com
Tue Nov 26 18:26:14 UTC 2019


Fabricio, this is great work!

One thing that stands out is that a very large amount of time is being
spent in remove_duplicates(), 65% of the total runtime of the sync.

   - 13% of the total runtime spent on this inside cast()
   https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/repo_version_utils.py#L31
   - 31% of the total runtime spent in these lines
   https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/repo_version_utils.py#L37-L38
   - 20% of the total runtime spent on this line
   https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/repo_version_utils.py#L42

There's a couple of suggestions on how to improve this code written up here
https://pulp.plan.io/issues/5701

The memory usage seems to be coming from the Django ORM backend.  My guess
is that lines 37 & 38 (linked above) are generating some extremely long
SQL, because it's basically doing OR with a vast number of individual
queries. And I think that is also not great for performance.
https://www.cybertec-postgresql.com/en/avoid-or-for-better-performance/

Another hint that that might be the case is that the memory usage grows and
grows and then collapses all at once.  That would make sense if the SQL
compiler's data structure is growing and growing and then being destroyed
after the query executes.

I would imagine that if we fix our queries, we'll both fix the memory
consumption and improve performance at the same time.

On Tue, Nov 26, 2019 at 11:57 AM Fabricio Aguiar <fabricio.aguiar at redhat.com>
wrote:

> Hi everyone,
>
> I've been investigating memory consumption for syncing on RPM plugin - issue
> 5688 <https://pulp.plan.io/issues/5688>, and I want to share what I've
> found so far.
>
> First, it seems that a new process is created when we started to sync, and
> this new process is responsible for the large amount of memory consumption
> observed.
>
> As I shared on some notes here: https://pulp.plan.io/issues/5688
> The problem seems to affect only RHEL, I did not see a large consumption
> of memory for the following repos:
> - CentOS 6: http://mirror.centos.org/centos-6/6.10/os/x86_64/
> - CentOS 7: http://mirror.linux.duke.edu/pub/centos/7/os/x86_64/
> - CentOS 8 AppStrem:
> http://mirror.linux.duke.edu/pub/centos/8/AppStream/x86_64/os/
> - CentOS 8 BaseOS:
> http://mirror.linux.duke.edu/pub/centos/8/BaseOS/x86_64/os/
> - EPEL 7: https://dl.fedoraproject.org/pub/epel/7/x86_64/
>
> As stated on the issue, with RHEL 7 we can observe the problem, for my
> tests, I used the following repo:
> http://cdn.stage.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/
>
> For my tests, I used pulp3-source-fedora30 vagrant box, initially set with
> 4GB, then I increased to 16GB, and it was not enough. I only was able to
> sync RHEL 7 when I increased the memory up to 23GB.
>
> Utilizing py-spy I got those SVG profiles attached, you can find them
> hosted here:
> Sync - https://sendeyo.com/up/d/90a8ae4c8f
> Re-sync - https://sendeyo.com/up/d/4c855bcce3
>
> With the following branch, I was able to take some tracemalloc
> <https://docs.python.org/3/library/tracemalloc.html> snapshots (every 30
> seconds):
>
> https://github.com/pulp/pulp_rpm/commit/f3f079010cfe81b7f5cf3ef94c2402b1ccf7d90c
>
> *Tracemalloc for Sync at peak of memory consumption:*
>
> #1: sql/compiler.py:1512: 7825340.4 KiB
> #2: tasks/synchronizing.py:154: 4062651.9 KiB
> #3: stages/declarative_version.py:149: 4062601.9 KiB
> #4: models/repository.py:631: 4062130.0 KiB
> #5: models/repository.py:96: 4062127.7 KiB
> #6: rq/job.py:611: 4060459.2 KiB
> #7: rq/job.py:605: 4058273.8 KiB
> #8: rq/worker.py:822: 4053883.3 KiB
> #9: tasking/worker.py:100: 4053875.9 KiB
> #10: rq/worker.py:684: 4053849.9 KiB
> #11: rq/worker.py:610: 4051645.7 KiB
> #12: plugin/repo_version_utils.py:31: 4028192.9 KiB
> #13: models/base.py:124: 4028191.2 KiB
> #14: fields/related_descriptors.py:401: 4028165.2 KiB
> #15: models/query.py:1242: 4021212.9 KiB
> #16: models/query.py:402: 3995284.5 KiB
> #17: models/query.py:256: 3995284.5 KiB
> #18: models/query.py:55: 3952718.2 KiB
> #19: sql/compiler.py:1133: 3919417.7 KiB
> #20: db/utils.py:96: 3912674.4 KiB
> #21: psycopg2/_json.py:166: 3837534.7 KiB
> #22: json/decoder.py:337: 3837533.2 KiB
>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
> #23: json/__init__.py:348: 3837533.2 KiB
>     return _default_decoder.decode(s)
> #24: json/decoder.py:353: 3837532.7 KiB
>     obj, end = self.scan_once(s, idx)
> #25: rq/worker.py:670: 211891.5 KiB
> #26: tasking/worker.py:69: 198267.0 KiB
> #27: rq/worker.py:477: 191261.5 KiB
> #28: cli/cli.py:252: 186110.9 KiB
> #29: click/core.py:555: 185957.8 KiB
> #30: cli/cli.py:78: 104822.7 KiB
> #31: models/query.py:73: 48350.9 KiB
> #32: models/base.py:513: 48350.9 KiB
> #33: sql/compiler.py:405: 38126.1 KiB
> #34: sql/compiler.py:1087: 33355.6 KiB
> #35: click/core.py:956: 29580.3 KiB
> #36: click/core.py:1137: 29498.3 KiB
> #37: models/base.py:430: 29340.9 KiB
> #38: models/query.py:274: 25946.8 KiB
> #39: plugin/repo_version_utils.py:30: 25850.4 KiB
> #40: models/query.py:892: 24105.6 KiB
> #41: models/query.py:910: 24089.7 KiB
> #42: models/query.py:399: 24045.0 KiB
> #43: sql/query.py:1290: 17343.9 KiB
> #44: sql/where.py:81: 15629.7 KiB
> #45: sql/compiler.py:489: 15581.6 KiB
> #46: models/lookups.py:162: 15527.0 KiB
> #47: psycopg2/extras.py:678: 15027.3 KiB
> #48: sql/query.py:1323: 13486.4 KiB
> #49: sql/query.py:796: 13486.3 KiB
> #50: sql/compiler.py:474: 11291.3 KiB
> #51: sql/compiler.py:54: 11282.7 KiB
> #52: sql/compiler.py:45: 11282.7 KiB
> #53: click/core.py:717: 10190.6 KiB
> #54: models/query.py:72: 9967.2 KiB
> #55: models/query.py:1219: 8855.6 KiB
> #56: models/query.py:1231: 8855.6 KiB
> #57: models/query.py:1072: 8840.0 KiB
> #58: models/query.py:401: 8840.0 KiB
> #59: models/lookups.py:153: 8779.2 KiB
> #60: models/lookups.py:79: 8773.0 KiB
> #61: models/expressions.py:332: 8769.2 KiB
> #62: models/expressions.py:238: 8769.2 KiB
> #63: plugin/repo_version_utils.py:37: 7602.2 KiB
> #64: models/base.py:408: 7378.9 KiB
> #65: click/core.py:764: 7371.2 KiB
> #66: sql/compiler.py:254: 6853.7 KiB
> #67: models/expressions.py:737: 6851.8 KiB
> #68: models/lookups.py:159: 6745.8 KiB
> #69: sql/query.py:1293: 6744.2 KiB
> #70: sql/query.py:2258: 6743.6 KiB
> #71: sql/query.py:763: 6743.6 KiB
> #72: sql/query.py:2259: 6742.7 KiB
> #73: sql/compiler.py:1054: 6696.5 KiB
> #74: utils/deconstruct.py:17: 6261.2 KiB
> #75: fields/related.py:986: 6246.5 KiB
> #76: fields/__init__.py:381: 6245.8 KiB
> #77: models/base.py:411: 5990.2 KiB
> #78: python3.7/uuid.py:204: 5876.0 KiB
>     self.__dict__['int'] = int
> #79: sql/compiler.py:472: 5784.0 KiB
> #80: sql/compiler.py:1019: 5023.7 KiB
> #81: sql/compiler.py:1053: 4916.2 KiB
> #82: sql/query.py:350: 4490.9 KiB
> #83: sql/query.py:309: 4438.3 KiB
> #84: sql/compiler.py:219: 4406.5 KiB
> #85: sql/compiler.py:666: 4401.8 KiB
> #86: python3.7/copy.py:96: 4399.3 KiB
>     rv = reductor(4)
> #87: models/expressions.py:159: 4398.6 KiB
> #88: python3.7/copyreg.py:88: 4372.9 KiB
>     return cls.__new__(cls, *args)
> #89: python3.7/copy.py:274: 4372.9 KiB
>     y = func(*args)
> #90: python3.7/copy.py:106: 4372.8 KiB
>     return _reconstruct(x, None, *rv)
> #91: models/query_utils.py:82: 4223.2 KiB
> #92: bin/rq:10: 4115.9 KiB
> #93: python3.7/uuid.py:161: 4025.9 KiB
>     int = int_(hex, 16)
> #94: sql/query.py:1318: 3846.5 KiB
> #95: models/expressions.py:749: 3362.8 KiB
> #96: fields/__init__.py:709: 3340.3 KiB
> #97: utils/tree.py:108: 3008.9 KiB
> #98: models/query_utils.py:72: 2534.5 KiB
> #99: models/query_utils.py:85: 2534.2 KiB
> #100: models/base.py:503: 2214.3 KiB
> #101: models/query.py:199: 2190.8 KiB
> #102: models/query.py:190: 2178.6 KiB
> #103: models/query_utils.py:74: 2164.6 KiB
> #104: utils/tree.py:23: 2115.5 KiB
> #105: models/query_utils.py:59: 2113.1 KiB
> #106: sql/query.py:1251: 1895.8 KiB
> #107: sql/query.py:1116: 1895.3 KiB
> #108: models/lookups.py:19: 1892.1 KiB
> #109: sql/query.py:68: 1882.4 KiB
> #110: sql/query.py:1249: 1881.4 KiB
> #111: fields/related.py:974: 1679.4 KiB
> #112: sql/compiler.py:1018: 1672.8 KiB
> #113: base/operations.py:564: 1672.8 KiB
> #114: models/expressions.py:747: 1660.9 KiB
> #115: models/query.py:1893: 1590.1 KiB
> #116: models/query.py:63: 1590.1 KiB
> #117: models/query.py:62: 1554.4 KiB
> #118: models/query.py:61: 1554.4 KiB
> 962 other: 9.8 MiB
> Total allocated size: 99186.9 MiB
>
>
>
> *Tracemalloc for Re-sync at peak of memory consumption:*
>
> #1: sql/compiler.py:1512: 1813416.3 KiB
> #2: asyncio/events.py:88: 1159219.2 KiB
>     self._context.run(self._callback, *self._args)
> #3: stages/api.py:43: 1159151.2 KiB
> #4: asyncio/base_events.py:1775: 1158553.1 KiB
>     if bool(enabled) == bool(self._coroutine_origin_tracking_enabled):
> #5: asyncio/base_events.py:539: 1157867.4 KiB
>     self._thread_id = None
> #6: asyncio/base_events.py:571: 1156528.6 KiB
>     # local task.
> #7: stages/declarative_version.py:149: 1152654.5 KiB
> #8: tasks/synchronizing.py:154: 1151586.8 KiB
> #9: rq/job.py:611: 1147795.3 KiB
> #10: rq/job.py:605: 1147534.7 KiB
> #11: rq/worker.py:822: 1141556.0 KiB
> #12: tasking/worker.py:100: 1139872.3 KiB
> #13: rq/worker.py:684: 1137457.4 KiB
> #14: rq/worker.py:610: 1136707.1 KiB
> #15: models/query.py:274: 940312.1 KiB
> #16: models/query.py:1242: 921581.8 KiB
> #17: models/query.py:55: 908503.7 KiB
> #18: sql/compiler.py:1133: 906747.3 KiB
> #19: db/utils.py:96: 906707.8 KiB
> #20: stages/content_stages.py:48: 899314.9 KiB
> #21: psycopg2/_json.py:166: 872466.2 KiB
> #22: json/decoder.py:337: 871817.4 KiB
>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
> #23: json/__init__.py:348: 871817.4 KiB
>     return _default_decoder.decode(s)
> #24: json/decoder.py:353: 871816.8 KiB
>     obj, end = self.scan_once(s, idx)
> #25: click/core.py:555: 448794.2 KiB
> #26: rq/worker.py:670: 259351.8 KiB
> #27: tasking/worker.py:69: 257743.4 KiB
> #28: rq/worker.py:477: 249243.3 KiB
> #29: cli/cli.py:252: 244023.2 KiB
> #30: cli/cli.py:78: 226347.5 KiB
> #31: click/core.py:956: 220086.5 KiB
> #32: click/core.py:1137: 213146.8 KiB
> #33: click/core.py:717: 188052.0 KiB
> #34: tasks/synchronizing.py:581: 177471.0 KiB
> #35: models/package.py:232: 138350.9 KiB
> #36: stages/artifact_stages.py:219: 54726.5 KiB
> #37: stages/artifact_stages.py:251: 54723.6 KiB
> #38: models/query.py:1625: 47373.9 KiB
> #39: models/query.py:1738: 46489.3 KiB
> #40: fields/related_descriptors.py:627: 42352.2 KiB
> #41: models/package.py:238: 24673.4 KiB
> #42: models/query.py:1591: 24325.2 KiB
> #43: models/query.py:1244: 18744.3 KiB
> #44: models/query.py:771: 18744.3 KiB
> #45: models/query.py:73: 12827.6 KiB
> #46: models/base.py:513: 12827.6 KiB
> #47: sql/compiler.py:405: 11912.8 KiB
> #48: psycopg2/extras.py:678: 11666.5 KiB
> #49: models/package.py:245: 10837.2 KiB
> #50: sql/compiler.py:1087: 7407.8 KiB
> #51: sql/where.py:81: 6915.5 KiB
> #52: models/base.py:741: 6705.2 KiB
> #53: models/base.py:779: 6697.1 KiB
> #54: models/progress.py:144: 6607.2 KiB
> #55: click/core.py:764: 6603.1 KiB
> #56: models/progress.py:191: 6601.5 KiB
> #57: tasks/synchronizing.py:609: 6574.8 KiB
> #58: python3.7/uuid.py:204: 6532.6 KiB
>     self.__dict__['int'] = int
> #59: models/base.py:851: 6485.3 KiB
> #60: models/base.py:411: 6366.6 KiB
> #61: models/base.py:900: 6178.2 KiB
> #62: models/query.py:760: 5612.1 KiB
> #63: sql/compiler.py:1429: 5528.5 KiB
> #64: models/base.py:408: 5042.7 KiB
> #65: tasks/synchronizing.py:575: 4724.0 KiB
> #66: createrepo_c/__init__.py:312: 4722.6 KiB
> #67: tasks/synchronizing.py:285: 4722.6 KiB
> #68: models/lookups.py:162: 4583.4 KiB
> #69: tasks/synchronizing.py:582: 4236.2 KiB
> #70: tasks/synchronizing.py:256: 3655.1 KiB
> #71: sql/query.py:1312: 3358.8 KiB
> #72: sql/compiler.py:1417: 3312.9 KiB
> #73: models/query.py:892: 3121.7 KiB
> #74: models/lookups.py:153: 2934.4 KiB
> #75: models/lookups.py:79: 2931.5 KiB
> #76: models/expressions.py:332: 2918.8 KiB
> #77: models/expressions.py:238: 2918.5 KiB
> #78: fields/related_descriptors.py:219: 2766.3 KiB
> #79: fields/related_descriptors.py:629: 2765.8 KiB
> #80: models/query.py:910: 2575.7 KiB
> #81: sql/query.py:1290: 2513.6 KiB
> #82: sql/query.py:1318: 2257.2 KiB
> #83: stages/association_stages.py:36: 2219.3 KiB
> #84: sql/compiler.py:1452: 2214.1 KiB
> #85: sql/compiler.py:1370: 2214.1 KiB
> #86: python3.7/uuid.py:161: 2179.2 KiB
>     int = int_(hex, 16)
> #87: models/base.py:503: 2082.6 KiB
> #88: models/manager.py:82: 2008.6 KiB
> #89: tasks/synchronizing.py:585: 1803.4 KiB
> #90: fields/related_descriptors.py:527: 1767.5 KiB
> #91: python3.7/copy.py:96: 1754.9 KiB
>     rv = reductor(4)
> #92: urllib/parse.py:562: 1725.2 KiB
>     resolved_path.append('')
> #93: urllib/parse.py:489: 1725.2 KiB
>     if url and url[:1] != '/': url = '/' + url
> #94: urllib/parse.py:475: 1725.2 KiB
>     _coerce_args(*components))
> #95: models/lookups.py:159: 1647.3 KiB
> #96: models/expressions.py:159: 1625.9 KiB
> #97: tasks/synchronizing.py:512: 1619.8 KiB
> #98: models/query.py:1652: 1619.0 KiB
> #99: fields/__init__.py:801: 1604.3 KiB
> #100: models/base.py:473: 1545.2 KiB
> #101: python3.7/uuid.py:759: 1488.4 KiB
>     def uuid4():
> #102: models/package.py:248: 1455.4 KiB
> #103: sql/compiler.py:489: 1332.9 KiB
> #104: python3.7/copy.py:106: 1293.2 KiB
>     return _reconstruct(x, None, *rv)
> #105: python3.7/copyreg.py:88: 1279.1 KiB
>     return cls.__new__(cls, *args)
> #106: python3.7/copy.py:274: 1279.1 KiB
>     y = func(*args)
> #107: sql/query.py:350: 1119.1 KiB
> #108: models/query.py:1231: 1057.4 KiB
> #109: models/query.py:1219: 1057.4 KiB
> 1019 other: 34.8 MiB
> Total allocated size: 28304.4 MiB
>
>
>
> The snapshots can be found here:
>
>  memory_data
> <https://drive.google.com/drive/folders/1mdJtdGDvbwq_jLc5m_T8VBjjb-bDRDmS>
>
> *Memory consumption over time:*
> [image: sync.png]
>
> [image: resync.png]
>
> [image: sync_and_resync.png]
>
> Best regards,
> Fabricio Aguiar
> Software Engineer, Pulp Project
> Red Hat Brazil - Latam <https://www.redhat.com/>
> +55 11 999652368
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20191126/006e3e92/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sync.png
Type: image/png
Size: 20783 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20191126/006e3e92/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: resync.png
Type: image/png
Size: 18626 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20191126/006e3e92/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sync_and_resync.png
Type: image/png
Size: 29043 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20191126/006e3e92/attachment-0002.png>


More information about the Pulp-dev mailing list