-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multithreaded #333
Comments
(On the issue of CachedLAPAccess)
I will add more (detailed) comments later, but for now, I believe that main gain of cached LAP comes from detecting / saving common calls and not-calling underlying tool (so saves time). This is possible because we only keep one instance (global). If we move to "local" one, then I would say we should use normal (non-cached) LAP. Which will simply use less memory. CachedLAPAccess saves time by using up more memory (storing CASes) ... (I am assuming that the same sentence can appear in other fragments, too.) I will check the class in terms of thread-safety, and then get back to you again. |
I do not know if this is the right way to solve this. I think the better way is to have a local cached one because our webserver runs without any shout down for month. If we cache the growing dataset in a global collection than we could run some day out of memory. ... As we only need to build the whole graph in one call I think it is enough to have the cache local. Would it be possible to have the implementations of the laps global and use the cache in a local way? ... I think we could modify the classes in this way. I could make the adaption with Kathrin when she is next time in our office. |
Oh, now I see more clearly about what you mean by "global" and "local". I completely agree. CachedLAP is only required when you build a graph; and once the graph is stable, you can throw out the CachedLAP instance (and all its memory). When you next requires to expand the graph, only then you need another (a new, clean) instance of CachedLAP. So in this sense, you should only activate one CachedLAP instance when you build / extend entailment trees. I will make cached LAP more thread-safe and get back to this comment once more... |
I added the final keyword in the fragment annotators as requested. |
I've updated CachedLAP as thread-safe as much as I can see. #336 Underlying LAPAccess is not thread safe (which would not be called much, since we are using cache), and I've surrounded it with synchornized method; and I've updated all instance variables to be accessed in a way that is thread safe. I do not know the context of thread-safe attempt --- but just let me outline that, many other codes (as I see) of TL is not thread-safe. For example, codes like Entailment graph generations are not thread safe, and I guess it would be very hard to make all of them thread-safe (unless you do simply blocking such as wrap everything in sync, etc). So, :-) I am assuming that you are trying to make codes that are related to USE-case 2 "usage" of the already built graph --- but not on building parts. Is this correct? :-) Just wanted to know this is the case. Anyhow, CachedLAPAccess would be now thread-safe. --- eh, or so I believe. |
I think it is no big deal to make the resources thread safe. It would be an advantage to initialize some classes only once.
These classes are thread safe:
KeywordBasedFragmentAnnotator new
SentenceAsFragmentAnnotator
They only have one final variable and that one is thread safe
more information here http://www.javamex.com/tutorials/synchronization_final.shtml
These classes are also thread safe but could be implemented in a better way:
TokenAsFragmentAnnotator
DependencyAsFragmentAnnotator
tokenPOSFilter, dependencyTypeFilter, governorPOSFilter, dependentPOSFilter - These variables could be final because the access is only a read. These methods are used by the maps "contain" and "isEmpty".
These classes have to be modified:
CachedLAPAccess
This class is not thread safe. Main modification should be done by converting the Hashmap into an ConcurrentHashMap. There are some other variables that maybe has to be discussed if they have to be uniq for each call. Are the stored information unique for a fragment graph? Is it good to have a global CachedLAPAccess or should we have global Lap implementations and use CachedLAPAccess only in local context?
The text was updated successfully, but these errors were encountered: