Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large POST Data #3

Open
gusmanb opened this issue Jul 1, 2014 · 12 comments
Open

Large POST Data #3

gusmanb opened this issue Jul 1, 2014 · 12 comments
Assignees

Comments

@gusmanb
Copy link

gusmanb commented Jul 1, 2014

I am planning to use this library (tested it and works great) for production and have a doubt.

How will perform this if big files are sent? do all the file data will be in memory?

If yes, is there a "simple" way to redirect large POST data to disk instead to memory?

Thanks.

@mzabani
Copy link
Owner

mzabani commented Jul 2, 2014

Glad you're liking FastCgiNet!
Sadly, the entire request's contents will be kept in memory - including the file's contents.
However, this is undesirable and certainly is a common need for FastCgiNet's users. I'll keep this issue as a feature suggestion and will try to work on it ASAP, which right now might mean about a month until this lands on master.

As for going around this limitation, you could create a class that inherits from ApplicationSocketRequest and override AddReceivedRecord without calling its base implementation. This is not that straightforward because you will have to deal with kinds of records (BeginRequest, EndRequest and others), but you can then read the contents of each record, which is at most 64k long, and deal with it as you wish. Keep in mind that not calling the base method will not append the contents of each record to the streams (Params, Data, Stdin etc.) in your class.

Thank you for the feedback,
Marcelo.

@mzabani mzabani self-assigned this Jul 2, 2014
@gusmanb
Copy link
Author

gusmanb commented Jul 3, 2014

Hi Marcello.

Thanks for your response.

I did revise the code and I'm thinking about modifying the FastCgiStream to fall back to hard disk caching if the written content is bigger than X (1mb per example).

The most hard part I found with this solution is to modify the SocketStream because it uses the underliying record streams, but I see now that it only uses them in Write mode and the hard disk caching will be only for read streams.

What do you think about it?

@mzabani
Copy link
Owner

mzabani commented Jul 6, 2014

Having FastCgiStream fall back to hard disk seems to be the most natural way for this feature to be implemented, although we have to be cautious about how we are going to make this new functionality accessible to the end user, having in mind that:

  • The place where files are created for the data has to be configurable by the user. Perhaps even the names of the created files have to be thought of so that no conflicts may arise. Having an interface such as ITemporaryFilenameGenerator that exposes something like a string GenerateFilename() would be a start; instances that implement this interface would be injected into FastCgiStream's through a constructor, for example.
  • The maximum request size has to be user-configurable as well, a static property would do the trick, although it is desirable that FastCgiStream itself receives this value through its constructors.

I've been pretty busy for some time and will remain so for a few more weeks. If you feel like implementing this functionality, it would be of great help! If not, I'll deal with it as soon as I can.

Thank you for the attention,
Marcelo.

@gusmanb
Copy link
Author

gusmanb commented Jul 7, 2014

Hi Marcelo.

I will try to start to implement it this weekend, I'm integrating this into a bigger project (if it works as expected I will publish it to github, it's a RESTful service framework but 100% compatible with mono) and hope to have finished soon.

I will add to the FastCGI stream an static property named "TempFilenameGenerator" of type ITemporaryFilenameGenerator which will have an instance of a default implementation what will use the Path.GetTempFileName/GetTempFolder to create new files, also, if this property is set to null the hdd support will be disabled.

About the maximum request size, I'm not sure if it's the right location to do it in the stream, maybe a parameter in the ApplicationSocketRequest's constructor is more natural (defaulted to 1Mb per example), what do you think?

Cheers.
Agustín.

@mzabani
Copy link
Owner

mzabani commented Jul 8, 2014

The part with the interface and the static property sounds pretty good to me.

Also, I agree that defining this in a ApplicationSocketRequest's constructor is a good thing, although it still means FastCgiStream will receive this size in one of its constructors too, right?

Another thing to keep in mind is the users that are on the webserver's side. They might need to write large amounts of data to the application. I imagine it could be similar to this implementation in that case, although if you're not interested in doing this part you can leave it to me later (I'll just base myself on your code).

Also, don't be afraid to do things that feel very natural to you. I want this to be a very democratic project. If anything, we can change and tune things later.

@gusmanb
Copy link
Author

gusmanb commented Jul 8, 2014

Hi again.

Indeed Marcelo, the size parameter will propagate between streams via it's constructor.

About the server side, I'm already sending files about 1Gb without problems, I added a function to FastCGIStream to remove underliying streams and modified the SocketStream's Flush function to remove the underliying flushed streams, in this way just flushing each chunk of data sent via the socket stream is enough to send large files efficiently. Also those changes speed up calls between Flush() because it does not need to iterate over flushed streams.

Additionally did some minor changes to speed up a bit the process (removed some control flow through exceptions, changed some socket calls to use other enumeration and so on).

When I have the hdd code I will commit everything.

Cheers.

@mzabani
Copy link
Owner

mzabani commented Jul 9, 2014

That is a very cool way to do this! We will only need to make sure we document this functionality very well later on, but that is something I'll do with pleasure.
So in the end, from the webserver's perspective, the streams are forward only, but the user has to be aware that all data written to them will remain in memory until they are flushed. From the application's perspective, streams are not forward only and there is a filesystem fallback mechanism implemented. It sounds very good overall.
Thank you very much for all the help!

@gusmanb
Copy link
Author

gusmanb commented Jul 16, 2014

Hi Marcelo, I have a rush of work and don't know if I will have time soon to do the modifications.
I did fork the repository and updated it with my changes.

Hope to have time soon.

Cheers

@gusmanb
Copy link
Author

gusmanb commented Aug 10, 2014

Hi Marcelo.

I see you added the hdd fallback for large requests, really nice 👍

I have merged your new changes with the ones removing unnecesary blocks on flush(), It has ben just merged, hope to have a bit of time on Wednesday I will test it extensively and if all is ok will do a pull request.

Cheers.

@mzabani
Copy link
Owner

mzabani commented Aug 11, 2014

Hi gusmanb,

Good thing you're going to take a look at it. I have to admit I haven't tested this new feature very well at all.
One more thing that I'd like to discuss a little bit more is the default maximum in-memory request size. I set 2KB but we should be thinking of the maximum size of a typical form request. The goal should probably be to avoid that a request with the max cookie size, a long URL and lots of form data (but no posted files) gets stored on disk. I'll research on maximum allowed sizes for some of these parameters to come up with a good default.

Good work on your side!

@mzabani
Copy link
Owner

mzabani commented Sep 1, 2014

@gusmanb, how are things doing over there? Any news?

@gusmanb
Copy link
Author

gusmanb commented Sep 2, 2014

Hi Marcelo, sorry for the delay, I'm working on an application for a big
client and I'm a bit stressed, will try to test it this weekend.

Cheers.

2014-09-01 4:20 GMT+02:00 mzabani [email protected]:

@gusmanb https://github.com/gusmanb, how are things doing over there?
Any news?


Reply to this email directly or view it on GitHub
#3 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants