What is text mining? Support your discussion with at least 3 references.

  1. What are the business costs or risks of poor data quality? Support your discussion with at least 3 references.

  2. What is data mining? Support your discussion with at least 3 references.

  3. What is text mining? Support your discussion with at least 3 references.

1 (Bala murali). Today we can pull data from many sources every day, whether from social media, customer experience, or elsewhere, and the resulting volume is huge. Data is only useful for later reference if it is collected properly and in an appropriate format. If it is incomplete or badly formatted, the whole data set may be useless, and decisions based on it will not produce the expected outcome. Poor data quality can mislead us into situations where our reputation is at stake. The costs continue after the bad data has been identified: the organization has to deploy resources to correct it, and the time spent on corrective measures is wasted. The cost of collecting the data plus the cost of correcting it adds up to a huge loss for the organization. And if decisions are based on bad data, customers may lose interest in the organization and opt for an alternative (i-mind.se, 2018).

Data mining is the process of finding hidden patterns in data by examining it from different perspectives. The patterns have to be identified so that they can be used effectively for different purposes. Once found, they are categorized and stored as useful information in places such as data warehouses. They are then used for analytics, algorithms, sound business decisions, and other benefits, with the aim of minimizing costs while increasing revenue. The following steps are involved in the process of data mining.

  1. Extract the data, convert it, and store it in a database,
  2. Store and manage the data in that database,
  3. Grant access to authorized persons so that they can analyze it,
  4. Present the data in a readable form, such as a graph (Rouse, 2017).
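
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a real data mining system: the sample sales records, the table name `sales`, and the function names are all made up for the example, and an in-memory SQLite database stands in for a data warehouse. Access control (step 3) is reduced to a plain query function.

```python
import sqlite3

# Hypothetical raw records; amounts arrive as text and need converting.
RAW_SALES = [
    ("2018-01-03", "north", "120.50"),
    ("2018-01-03", "south", "80.00"),
    ("2018-01-04", "north", "99.50"),
]


def load(conn, rows):
    """Steps 1-2: extract the rows, convert amounts to numbers, store them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (day TEXT, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(day, region, float(amount)) for day, region, amount in rows],
    )
    conn.commit()


def revenue_by_region(conn):
    """Step 3: the kind of query an analyst with access might run."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cur.fetchall())


def text_chart(totals, width=20):
    """Step 4: present the totals in a readable form (a text bar chart)."""
    top = max(totals.values())
    lines = []
    for region, total in totals.items():
        bar = "#" * round(width * total / top)
        lines.append(f"{region:<6} {bar} {total:.2f}")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
load(conn, RAW_SALES)
totals = revenue_by_region(conn)
print(text_chart(totals))
```

In practice the chart would more likely be drawn with a plotting library; the text bars simply keep the sketch dependency-free.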

Text Mining

Text mining is the process of extracting high-quality information from text. It gives structure to unstructured information, allowing users to identify information hidden in a text and make it useful. From this definition it may look simple, but it is not as simple as it looks. The text is decoded using Natural Language Processing (NLP). For this reason, text mining is not readily compatible with most technologies (larrobino, 2017).
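
As a rough illustration of turning unstructured text into something countable, the sketch below tallies the most frequent terms across a few sentences. It is a simple word-frequency count, far short of real NLP, and the sample reviews, the small stop-word list, and the function name `top_terms` are assumptions made up for the example.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "and", "was", "but", "it", "to", "of"}

# Hypothetical customer reviews standing in for unstructured text.
REVIEWS = [
    "The delivery was late and the box was damaged.",
    "Late delivery again, but support was helpful.",
    "Support resolved the damaged item quickly.",
]


def top_terms(documents, n=3):
    """Count how often each non-stop-word appears across the documents."""
    counts = Counter()
    for doc in documents:
        # Lowercase, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


print(top_terms(REVIEWS, n=4))
```

Even this crude count surfaces recurring themes (late delivery, damaged goods, support) that are not visible from any single review.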

2 (Abhinay apuri).

Poor Data Quality:

As a data analyst and root cause analyst, I understand the importance of quality data and how organizations can benefit from using it. In today's highly competitive market, data management is a key factor in sustaining a business and can decide whether an organization rises or falls. Maintaining data quality is an ongoing problem that plagues many organizations, and if IT leaders do not find ways to improve the accuracy of their data, the consequences could be serious. Organizations make errors in collecting and managing customer data in several ways. Human error is a major one. For example, when a customer is filling out a form on a business's website, he or she may make a careless mistake, such as misspelling a word, giving an outdated address, or giving the wrong phone number. Once these errors are added to the system, they can be difficult to correct, and they can lead to long-term problems. Organizations rely on accurate data to support their marketing, sales, and customer service efforts. If they do not have the right data on their customers, they are bound to waste time pursuing leads that do not exist. Time, as is commonly said, is money.
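
One common way to limit such entry errors is to check a record before it reaches the system. The sketch below is a minimal example under stated assumptions: the field names (`name`, `email`, `phone`), the phone rule (10 to 15 digits, optional leading `+`), and the loose email pattern are all hypothetical, and checks like these cannot catch an outdated address or a well-formed but wrong number.

```python
import re

# Assumed rules for the example, not a standard.
PHONE_PATTERN = re.compile(r"^\+?\d{10,15}$")
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
REQUIRED_FIELDS = ("name", "email", "phone")


def find_problems(record):
    """Return a list of data-quality problems found in one form submission."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field, "").strip():
            problems.append(f"missing {field}")

    # Strip common separators before checking the digits.
    phone = re.sub(r"[\s().-]", "", record.get("phone", ""))
    if phone and not PHONE_PATTERN.match(phone):
        problems.append("invalid phone")

    email = record.get("email", "").strip()
    if email and not EMAIL_PATTERN.match(email):
        problems.append("invalid email")
    return problems


good = {"name": "Ann Lee", "email": "ann@example.com", "phone": "(555) 010-4477"}
bad = {"name": "", "email": "bob@example", "phone": "12345"}
print(find_problems(good))
print(find_problems(bad))
```

Rejecting or flagging a record at entry time is far cheaper than tracking the error down after it has spread to marketing and sales systems.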

Data Mining and Text Mining:

Text mining and data mining are often used interchangeably to describe how information or data is processed. IT professionals in the enterprise data world have focused on "data mining", which we can describe as the discovery of knowledge from structured data (data contained in structured databases or data warehouses). Today most available business data is unstructured; although it may also contain numbers, dates, and facts in structured fields, unstructured data is usually text (articles, website content, blog posts, etc.). The presence of unstructured data makes it harder to manage data effectively using standard business intelligence tools. The discovery of knowledge from sources that contain text or unstructured data is called "text mining". The main difference between data mining and text mining, therefore, is that in text mining the data is unstructured.



