Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty files when request logger enabled. #4

Open
monostop opened this issue Apr 17, 2015 · 3 comments
Open

Empty files when request logger enabled. #4

monostop opened this issue Apr 17, 2015 · 3 comments

Comments

@monostop
Copy link

I'm serving dynamically generated files with my WSGI application and are streaming them to the client. When I apply the wsgi-request-logger middleware the files are empty. The reason is that when you are calculating the content type you are calling len on the retval which on a file-like object sets the internal pointer to the end of the file.

# file /__init__.py
51 content_length = content_lengths[0] if content_lengths else len(b''.join(retval))

A minimal example with flask, where the middleware breaks the application:

# file  /main.py
import logging
from flask import Flask, send_file
import requestlogger

app = Flask(__name__)

handler = logging.StreamHandler()
app.wsgi_app = requestlogger.WSGILogger(app.wsgi_app, [handler], requestlogger.ApacheFormatter())

@app.route('/file', methods=['GET'])
def serve_file():
    f = open('data', 'r')

    return send_file(f, mimetype="application/ocet-stream")

if __name__ == '__main__':
    app.run(port=4500, debug=True)
#file /data
Some data...

This application will serves an empty file. Commenting out the middleware serves the file properly.

@pklaus
Copy link
Owner

pklaus commented Apr 19, 2015

Thanks for the detailed bug report. You're right, the behaviour is not optimal and I will improve it. I will fix it in wsgi-request-logger but you can work-around this issue by setting the "Content-Length" header in your application. If you call send_file() with a file name instead of a file-like object, this is done automatically. If you really need to provide send_file() with a file-like object, you can also set the content_length yourself (and you should, if there's any way of knowing it the moment you create the stream to let Browsers display an e.t.a.). In your example, this could be done like this:

@app.route('/file', methods=['GET'])
def serve_file():
    f = open('data', 'rb')
    sf = send_file(f, mimetype="application/ocet-stream")
    sf.headers['Content-Length'] = str(os.path.getsize('data'))
    return sf

Concerning the how-to-fix-it in wsgi-request-logger I'm thinking about something like this:

        if content_lengths :
            content_length = content_lengths[0]
        elif all(hasattr(retval, attr) for attr in ('seek', 'tell')):
            retval.seek(0, 2)
            content_length = retval.tell()
            retval.seek(0)
        elif hasattr(retval, 'file') and hasattr(retval.file, 'seek') and hasattr(retval.file, 'tell'):
            retval.file.seek(0, 2)
            content_length = retval.file.tell()
            retval.file.seek(0)
        else:
            content_length = len(b''.join(retval))

Not optimal yet... If you're really streaming content where you don't know the content size at time when you start serving it (e.g. live content such as mjpeg frames from a web cam), the seek(0,2) will most probably block. So what should the log-file contain for the file size in those cases? I'm not sure.

Let me know what you think and thanks again for reporting the issue.

@AndCycle
Copy link

AndCycle commented May 4, 2017

the alternative is to delay the log until response close, you have to do some counting during the response,
I just hit by this now I am searching an alternative.

@AndCycle
Copy link

AndCycle commented May 4, 2017

simple POC

    def __call__(self, environ, start_response):
        start = clock()
        status_codes = []
        content_lengths = []

        def custom_start_response(status, response_headers, exc_info=None):
            status_codes.append(int(status.partition(' ')[0]))
            for name, value in response_headers:
                if name.lower() == 'content-length':
                    content_lengths.append(int(value))
                    break
            return start_response(status, response_headers, exc_info)
        retval = self.application(environ, custom_start_response)
        try:
            counter = 0
            for i in retval:
                yield i
                counter += len(i)
        finally:
            runtime = int((clock() - start) * 10**6)
            content_length = content_lengths[0] if content_lengths else counter
            msg = self.formatter(status_codes[0], environ, content_length, ip_header=self.ip_header, rt_us=runtime)
            self.logger.info(msg)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants