Nusacular (Nusantara Vernacular)

This project aims to develop a system capable of detecting regional languages or dialects from text. We will collect a dataset containing text in various regional languages or dialects and train machine learning models to recognize and classify them. The outcome will be useful for text processing applications that require an understanding of regional languages and for the preservation of local culture.

📷 Features

Regional Language Detection:

Made using a Naive Bayes classifier and a TF-IDF vectorizer. The model is trained using a dataset from NusaX containing 10,000 sentences in 10 regional languages. The model is able to detect 10 regional languages correctly.

Chatbot:

Made using the Gemini (Google Generative AI) API. The user can chat with the chatbot in the context of regional languages.

Regional Language Translation:

Made using the GoogleTrans library. The library is able to translate from Indonesian into Javanese and Sundanese.

Text to Speech:

Made using the gTTS (Google Text-to-Speech) library. The library is able to convert text into speech in Javanese and Sundanese languages.

Sentiment Analysis:

The feature is based on this notebook. Users can input a sentence, and the model will predict the sentiment of the given sentence.

🎌 Supported Regional Languages:

Acehnese
Balinese
Banjarese
Buginese
Javanese
Sundanese
Madurese
Minangnese
Ngajunese
Toba Batak

🔮 Future Improvements:

Increase model accuracy
Add more language
Feature to create an account
Use an online database

✨ Author:

Name	NIM
Bima Rakajati	A11.2020.13088
Enrico Zada	A11.2020.12972
Rosalia Natal Silalahi	A11.2020.13084
Devi Kartika Sari	A11.2020.12518

📙 Reference:

https://github.com/akhiilkasare/Language-Detection-Using-NLP-and-Machine-Learning
https://github.com/IndoNLP/nusax

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Nusacular (Nusantara Vernacular)

📷 Features

Regional Language Detection:

Chatbot:

Regional Language Translation:

Text to Speech:

Sentiment Analysis:

🎌 Supported Regional Languages:

🔮 Future Improvements:

✨ Author:

📙 Reference:

Files

README.md

Latest commit

History

README.md

File metadata and controls

Nusacular (Nusantara Vernacular)

📷 Features

Regional Language Detection:

Chatbot:

Regional Language Translation:

Text to Speech:

Sentiment Analysis:

🎌 Supported Regional Languages:

🔮 Future Improvements:

✨ Author:

📙 Reference: