[Pulp-list] Download analytics from CDN?
daviddavis at redhat.com
Tue Apr 6 19:06:36 UTC 2021
I don't know much about AWS logging but Pulp does set the filename in the
response-content-disposition. Could that be used to determine the
filename for each request?
If not, I'm looking at the boto3 docs for get_object to see if there's
another parameter we could set to help you track the filename in requests,
but I'm not seeing anything useful. My knowledge of S3 is a bit limited, so
if you have a suggestion for how we could construct a request to S3 that
would help you track the filenames of requests, I could probably look at
how we could support it in Pulp 3.
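
To make the response-content-disposition idea concrete, here is a rough
sketch of how a log post-processor might recover the filename. It assumes
(I have not verified this against real logs) that the redirect URL carries
response-content-disposition as a query parameter, that the query string
shows up in the log percent-encoded exactly once, and that an empty query
is logged as "-". The function name filename_from_query is made up for
illustration; only the Python standard library is used.

```python
from email.message import Message
from urllib.parse import parse_qs


def filename_from_query(query):
    """Return the filename found in response-content-disposition, or None.

    `query` is the raw query string of one logged request (hypothetical
    input; adjust to however your log tool exposes that field).
    """
    # parse_qs percent-decodes the values; a bare "-" yields an empty dict.
    params = parse_qs(query)
    values = params.get("response-content-disposition")
    if not values:
        return None
    # Let the stdlib header parser pull out the filename parameter.
    msg = Message()
    msg["Content-Disposition"] = values[0]
    return msg.get_filename()


if __name__ == "__main__":
    sample = (
        "response-content-disposition="
        "attachment%3Bfilename%3Dfoo-1.0-1.x86_64.rpm&X-Amz-Expires=60"
    )
    print(filename_from_query(sample))
```

If the logs turn out to encode the query string differently (for example
double-encoded), the decoding step would need adjusting.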
On Tue, Mar 30, 2021 at 10:43 AM Danny Sauer <danny.sauer at konghq.com> wrote:
> I've got Pulp set up to serve all the content from S3 behind CloudFront.
> This works really well, except for a minor issue: the content URLs are all
> the UUIDs for artifacts, not, for example, the pretty name of the RPM being
> downloaded. That's an issue in my situation because we'd really like to
> generate download analytics using off-the-shelf tools which consume the AWS
> CDN standard log format.
> My initial thought was that it might be easy to have the redirects include
> a query string in the generated URL which notes the original filename or
> relative path requested. But I don't have sufficiently developed Django
> skills to know the easiest way to do that (or if it's even reasonable to
> think that's easy). Using the content server's logs is another option, but
> I have some other content on the same S3 bucket which may not necessarily
> be reached solely through Pulp's content server, so that means two log
> locations, etc. If it was easy to make Django / Gunicorn log to an S3
> bucket in a manner similar to CloudFront, that might also be ok.
> Post-processing logs with a series of API calls to work out what artifact
> maps to what repository content would ideally be a last resort.
> Anyone have some great insights which might help me out here? :) If it
> helps, I'm building my own Docker images which ultimately run in EKS. So
> patches / extra modules are an option, but I'd prefer to stay as close to
> vanilla upstream as possible with environment variable-based config.