Hi xychang,
Thank you for sharing your great work! I would greatly appreciate it if you could help me resolve the issue below.
I first tried the CLI interface and was able to generate 'results.json' and 'vis.json'. However, I was unable to open http://localhost:8000/multi_color.html?json=vis.json, so I decided to give the Python interface a try.
I am using the code and parameter configuration below to reproduce 'results.json' and 'vis.json'.
import recursiveHierarchicalClustering as rhc
import recursiveHierarchicalClusteringFast as rhcFast
data = rhc.getSidNgramMap(inputPath)
treeData = rhcFast.run(inputPath, data, outPath)
environment: Jupyter Notebook
inputPath: I added your input.txt file to one directory and set inputPath = '/home/chenruihao/test_clustering/input.txt'
outPath: I didn't find a description of outPath, but I found outputPath, which is described as "The directory to place all temporary files as well as the final result." I assume outPath and outputPath both refer to the directory where output files are stored, so I set outPath = '/home/chenruihao/test_clustering/output/'
I got the error below when I tried to run the above code:
/home/chenruihao/test_clustering/recursiveHierarchicalClustering.py:247: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
result = np.linalg.lstsq(A, y)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-57-3760f95de317> in <module>
----> 1 treeData = rhcFast.run(inputPath, data, outPath)
~/test_clustering/recursiveHierarchicalClusteringFast.py in run(ngramPath, sid_seq, outPath)
416
417 hc = HCClustering(
--> 418 matrix, sid_seq, outPath, [], idxToSid,
419 sizeThreshold=0.05 * len(sid_seq), idfMap=idfMap)
420 result = hc.runDiana()
~/test_clustering/recursiveHierarchicalClusteringFast.py in runDiana(self)
337 matrix = calculateDistance.partialMatrix(
338 sids,
--> 339 rhc.excludeFeatures(rhc.getIdf(self.sid_seq, sids),
340 newExclusions),
341 ngramPath,
NameError: name 'ngramPath' is not defined
Q1: How may I fix this error?
Attempted fix: 'ngramPath' is referenced in 'recursiveHierarchicalClusteringFast.py', so I tried hard-coding it.
It looks like ngramPath in the run function is the same as sys.argv[1], and by definition ngramPath is the path to the computed pattern dataset. So I hard-coded ngramPath = '/home/chenruihao/test_clustering/input.txt', the same as inputPath, but I still got the above error. Would love to hear your thoughts.
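My guess at why hard-coding may not have helped: the traceback shows runDiana reading ngramPath as a bare name, and Python looks such names up in the globals of the module where the function is defined, so a variable set in my notebook would never be seen there. Here is a minimal sketch of that behavior using a made-up stand-in module (fake_rhc_fast, helper, and run are hypothetical names, not the library's code):

```python
import types

# Hypothetical stand-in for a module whose helper reads a bare global name.
# This is NOT the library's code; it only mimics the lookup pattern.
SOURCE = '''
def helper():
    # 'ngramPath' is looked up in this module's globals, not in run()'s locals.
    return ngramPath

def run(ngramPath):
    # The parameter is local to run(); helper() cannot see it.
    return helper()
'''

fake_module = types.ModuleType("fake_rhc_fast")
exec(SOURCE, fake_module.__dict__)


def call_run(path):
    """Return run()'s result, or the NameError message if the lookup fails."""
    try:
        return fake_module.run(path)
    except NameError as exc:
        return "NameError: " + str(exc)


# A variable defined in the caller's namespace (e.g. a notebook cell)
# is invisible to functions defined inside the other module.
ngramPath = "/home/chenruihao/test_clustering/input.txt"
before = call_run(ngramPath)

# Assigning the attribute on the module object creates the module-level global.
fake_module.ngramPath = ngramPath
after = call_run(ngramPath)

print(before)
print(after)
```

If this is indeed the cause, assigning rhcFast.ngramPath = inputPath before calling rhcFast.run might work around it, though I have not confirmed that this is the intended usage.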
Q2: I also want to understand which user_ids were clustered, their cluster membership, and their corresponding action-gap-action patterns, similar to the issue discussed in another thread. Would it be possible to answer my question, as well as the question in that thread, using only the result.json file rather than modifying the code?
My understanding is that in result.json, for each level of cluster:
key = 1 stores the user_ids that were clustered at that level;
key = 2 (exclusions) stores the action-gap-action/token members of the cluster.
Is that correct?
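To make the question concrete, this is roughly how I would read the file if my understanding is right. The node layout here is purely my assumption (a list whose index 1 holds either user_ids or child nodes, and whose index 2 holds a dict with an "exclusions" entry), and the sample values are made up rather than taken from a real result.json:

```python
import json

# Hypothetical layout (an assumption, not confirmed by the library):
#   node = [label, user_ids_or_child_nodes, {"exclusions": [...]}]
# The sample values are invented purely for illustration.
SAMPLE = json.loads('''
["t", [
    ["l", [101, 102], {"exclusions": ["A(1)B"]}],
    ["l", [103], {"exclusions": ["A(1)B", "C(2)D"]}]
], {"exclusions": []}]
''')


def collect_clusters(node, depth=0, out=None):
    """Walk the assumed tree and list user_ids and exclusions per leaf cluster."""
    if out is None:
        out = []
    payload, info = node[1], node[2]
    # Treat nested lists at index 1 as child nodes; anything else as user_ids.
    children = [item for item in payload if isinstance(item, list)]
    if children:
        for child in children:
            collect_clusters(child, depth + 1, out)
    else:
        out.append({
            "depth": depth,
            "user_ids": payload,
            "exclusions": info.get("exclusions", []),
        })
    return out


clusters = collect_clusters(SAMPLE)
for cluster in clusters:
    print(cluster)
```

If the real file is laid out differently, only the two index lookups and the "exclusions" key would need to change.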
Thanks!
Anthony