Link Organizer

To organize the links correctly, we need to classify them into topics. To do so, we have a few options:

  • Train an AI model to classify the links into topics (e.g. LDA via gensim, or BERT)
  • Do we train the model at runtime? Where to train it? On what data?
  • Do we use a combination of NLP and a predefined TOPIC_MAP?
    • NLP to classify the links into a topic
    • TOPIC_MAP to map the topic into a category

These questions are critical to making this tool useful and scalable.

At the end of the day, we need to map a link to a topic and a topic to a category.

This can already be done by manually specifying the topic of a link and the category of a topic.
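The manual two-step lookup described above can be sketched as follows. This is only an illustration: `LINK_TOPICS`, the example URLs, the contents of `TOPIC_MAP`, and the `categorize` helper are all hypothetical and not taken from the tool's code.

```python
from typing import Optional, Tuple

# Hypothetical manual assignments (link -> topic); the URLs are illustrative.
LINK_TOPICS = {
    "https://docs.python.org/3/": "python",
    "https://kubernetes.io/docs/": "kubernetes",
}

# Hypothetical topic -> category map, following the TOPIC_MAP idea.
TOPIC_MAP = {
    "python": "programming_languages",
    "kubernetes": "devops",
}


def categorize(url: str) -> Optional[Tuple[str, str]]:
    """Return (topic, category) for a manually tagged link, or None."""
    topic = LINK_TOPICS.get(url)
    if topic is None:
        return None  # the link has not been tagged yet
    # A topic without a category falls back to a placeholder bucket.
    category = TOPIC_MAP.get(topic, "uncategorized")
    return topic, category
```

An automatic classifier would only need to replace the `LINK_TOPICS` lookup; the topic-to-category step could stay the same.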

Categories and map to build LDA model

A scraper and preprocessor will be needed to extract the content of the links and classify them into topics. These are just some categories off the top of my head. We can add more as we go.
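As a rough idea of what the preprocessing half might look like, here is a minimal sketch using only the standard library. It assumes the page HTML has already been fetched and is passed in as a string. The `preprocess` function, the `_TextExtractor` class, and the tiny `STOPWORDS` set are hypothetical placeholders, and a real pipeline would likely use a proper stopword list and tokenizer.

```python
import re
from html.parser import HTMLParser
from typing import List

# Deliberately tiny, hypothetical stopword list for the example.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "with"}


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def preprocess(html: str) -> List[str]:
    """Turn raw HTML into lowercase tokens with stopwords removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    tokens = re.findall(r"[a-z][a-z0-9_+#]*", text)
    return [t for t in tokens if t not in STOPWORDS and len(t) > 1]
```

The resulting token lists are the kind of input a topic model such as LDA would typically be trained on.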

categories = {
    'programming_languages': ['python', 'javascript', 'ruby', 'cpp', 'java', 'rust', 'go', 'php', 'typescript', 'swift'],
    'devops': ['docker', 'kubernetes', 'jenkins', 'ci_cd', 'ansible', 'terraform', 'prometheus', 'infrastructure_as_code'],
    'machine_learning': ['supervised_learning', 'unsupervised_learning', 'reinforcement_learning', 'deep_learning', 'nlp'],
    'cybersecurity': ['penetration_testing', 'network_security', 'cryptography', 'incident_response', 'malware_analysis'],
    'software_development': ['agile', 'tdd', 'microservices', 'version_control', 'design_patterns'],
    'cloud_computing': ['aws', 'gcp', 'azure', 'serverless', 'cloud_security'],
    'databases': ['sql', 'nosql', 'database_optimization', 'data_warehousing'],
    'networking': ['tcp_ip', 'dns', 'http_https', 'load_balancers', 'osi_model'],
    'operating_systems': ['linux', 'windows', 'macos', 'kernel_architecture'],
    'data_science': ['data_cleaning', 'data_visualization', 'feature_engineering', 'big_data'],
    'web_development': ['frontend', 'backend', 'api_development', 'web_security', 'pwas'],
    'blockchain': ['bitcoin', 'ethereum', 'smart_contracts', 'dapps', 'consensus_algorithms'],
    'artificial_intelligence': ['expert_systems', 'nlp', 'game_ai', 'image_recognition'],
    'mobile_development': ['android', 'ios', 'cross_platform', 'mobile_security']
}
  1. Programming Languages
  2. DevOps
  3. Machine Learning
  4. Cybersecurity
  5. Software Development
  6. Cloud Computing
  7. Databases
  8. Networking
  9. Operating Systems
  10. Data Science
  11. Web Development
  12. Blockchain
  13. Artificial Intelligence
  14. Mobile Development
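
To go from a topic back to its category, the `categories` dict can be inverted. The sketch below uses a trimmed copy of that dict so it stays short; `build_topic_map` is a hypothetical helper name. It keeps a list of categories per topic because `'nlp'` is listed under both `machine_learning` and `artificial_intelligence`, so a plain one-to-one map would silently drop one of them.

```python
from collections import defaultdict
from typing import Dict, List

# Trimmed copy of the categories dict, shortened for the example.
categories = {
    'machine_learning': ['supervised_learning', 'deep_learning', 'nlp'],
    'artificial_intelligence': ['expert_systems', 'nlp', 'game_ai'],
    'devops': ['docker', 'kubernetes'],
}


def build_topic_map(categories: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert category -> topics into topic -> list of categories."""
    topic_map = defaultdict(list)
    for category, topics in categories.items():
        for topic in topics:
            topic_map[topic].append(category)
    return dict(topic_map)


topic_map = build_topic_map(categories)

# Topics that belong to more than one category need a tie-breaking rule.
ambiguous = {t: c for t, c in topic_map.items() if len(c) > 1}
```

How to resolve the ambiguous topics (pick one category, or allow a link to sit in several) is an open design choice.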

General Data Processing

Done

  • Add a way to add links manually
  • Add a way to add links from a file
  • Add a way to add links from a directory
  • Add a way to add links from a CSV file
  • Add a way to add links from an Excel file
  • Add a way to add links from a text file

TODO

  • Add a way to add links from a markdown file
  • Add a way to add links from a JSON file
  • Add a way to add links from a YAML file
  • Add a way to add links from an XML file
  • Add a way to add links from an HTML file
  • Add a way to add links from an RSS feed
  • Add a way to add links from an API
  • Add a way to add links from a database
  • Add a way to add links from a website
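
As a starting point for the first two TODO items, a minimal standard-library sketch might look like this. The function names `links_from_markdown` and `links_from_json` are hypothetical, and the URL regex is a simplification: for instance, it cuts off URLs that contain parentheses.

```python
import json
import re
from typing import List

# Simplified URL pattern; stops at whitespace, brackets, and quotes.
URL_RE = re.compile(r"https?://[^\s<>()\[\]\"']+")


def links_from_markdown(text: str) -> List[str]:
    """Collect unique http(s) URLs from markdown text, in order of appearance."""
    seen = {}
    for url in URL_RE.findall(text):
        # Drop trailing sentence punctuation picked up by the regex.
        seen.setdefault(url.rstrip(".,;"), None)
    return list(seen)


def links_from_json(text: str) -> List[str]:
    """Collect unique string values that look like http(s) URLs, at any depth."""
    found = {}

    def walk(node):
        if isinstance(node, str):
            if node.startswith(("http://", "https://")):
                found.setdefault(node, None)
        elif isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(json.loads(text))
    return list(found)
```

Both functions take the file contents as a string, so reading from disk stays separate from link extraction.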