Communities | AI & Data Science


Posted Questions

01

Posted by Siddhi Jain - "Data Science is very popular among engineering students. Students start learning by taking online courses and end up doing the same things over and over again in the name of a project. They try to master everything from Computer Vision to Natural Language Processing and to win a Kaggle competition. Everyone believes that Data Science entails studying a plethora of libraries, functions, and algorithms. So, what is it exactly? What exactly is the job, and could you please describe a typical day in the life of a data scientist? What steps should a beginner take to get their first job in the Data Science field? Why are companies not hiring freshers for this profile, and what is the best way to prepare for and land a job in Data Science?"

Answered by Aashray Saini, Data Scientist @HP

02

Posted by Siddharth Jain - "Suppose I have a sentence whose text may or may not contain some sample values. E.g., the sentence: 'This is the description text for a certain attribute. This attribute will contain IND, US, UK etc. as values.' Is there a way to find out whether a sentence contains sample values or not? Sample values could be anything like the ones given in the example or something like '1,2,3': basically numbers, characters, or words. Let me know if any additional information is required."

Answered by Bhavika Jain, Tech Leaders Fellow | Graduate Research Scholar at Purdue University

I am assuming the solution needed is in NLP terms. The nltk library in Python is a leading platform for such problems. The workflow to identify values in the text would be:

1. Word tokenization (or word segmentation): divide the string of written language into its component words. Library usage: nltk.word_tokenize. Output: ['This', 'is', 'the', ..., 'or', 'not']

2. Remove stopwords: stop words usually refer to the most common words in a language, such as "and", "the", and "a". Library usage: nltk.download("stopwords")

3. Use regex to identify the values.

Note: Another way to identify just the named entities is to use Named Entity Recognition. You can refer to the following blog: https://link.medium.com/ZLRUH44Q3fb
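
The three steps above can be sketched roughly as follows. This is only an illustration under a few assumptions, not the answerer's exact code: to keep it runnable without downloading nltk data, a small regex tokenizer and a hand-written stopword set stand in for nltk.word_tokenize and the nltk stopword corpus, and "sample values" are assumed to appear as three or more comma-separated tokens. The names STOPWORDS, TOKEN_RE, VALUE_LIST_RE, and find_sample_values are hypothetical.

```python
import re

# Assumption: a tiny illustrative stopword set. With nltk installed, the
# stopword corpus fetched by nltk.download("stopwords") is far more complete.
STOPWORDS = {"a", "an", "and", "the", "is", "as", "or", "for", "this", "will", "etc"}

# Step 1: tokenize into words/numbers or single punctuation marks
# (a simple stand-in for nltk.word_tokenize).
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

# Step 3 heuristic (assumption): sample values look like three or more
# single tokens separated by commas, e.g. "IND, US, UK" or "1,2,3".
VALUE_LIST_RE = re.compile(r"\w+(?:\s*,\s*\w+){2,}")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def remove_stopwords(tokens):
    """Step 2: drop common words, compared case-insensitively."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


def find_sample_values(sentence):
    """Return the first comma-separated run of values, or [] if none is found."""
    cleaned = " ".join(remove_stopwords(tokenize(sentence)))
    match = VALUE_LIST_RE.search(cleaned)
    if match is None:
        return []
    return [value.strip() for value in match.group().split(",")]


if __name__ == "__main__":
    print(find_sample_values("This attribute will contain IND, US, UK etc. as values."))
    # ['IND', 'US', 'UK']
    print(find_sample_values("This is the description text for a certain attribute."))
    # []
```

A sentence "contains sample values" when the returned list is non-empty. The comma-list pattern is deliberately narrow; ordinary prose with several commas can still trigger it, so the pattern would likely need tuning on real attribute descriptions.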

03

Posted by Chaitanya - "What is the everyday job of a Machine Learning engineer? Do they write models or clean data every day? How do they decide upon a model? For example, if they choose a model X, what is the reason behind it? Lastly, what is the track to become a machine learning engineer?"

Answered by Samruddhi Mhatre, Lead Machine Learning Engineer @Omdena