Creating Own AI Datasets from Different Language Sources Efficiently: Advantages, Disadvantages and Best Practices to Make Own Datasets
Publication Name: CANDO-EPE 2023 Proceedings: IEEE 6th International Conference and Workshop Óbuda on Electrical and Power Engineering
Publication Date: 2023-01-01
Volume: Unknown
Issue: Unknown
Page Range: 155-158
Description:
Artificial Intelligence (AI) has become a ubiquitous technology with the potential to revolutionize many industries, but it requires access to vast amounts of data to reach that potential. Both structured and unstructured data sources can be used to train AI algorithms, yet finding relevant datasets is challenging. By understanding these data sources and the best practices for working with them, we can unlock the full potential of AI and create groundbreaking solutions that benefit society as a whole. The main problem lies with international sources: most datasets are available only in English or other widely spoken languages. These are good sources, but translating them is expensive and time-consuming. This paper shows how to generate one's own datasets from open-source data using translation.
Open Access: Yes