Post 10
In our final full week of classes we discussed how the data we have now collected, analyzed, and manipulated can be visualized in a way that makes the most sense to the target market. The concept is nothing new to many of us, who have tried before to present topics clearly and sensibly, although when it comes to numerical data things can certainly be trickier.
The first article I looked at touched on a topic we had gone over in class: when is the right time to use graphs instead of tables, and what is the right kind of visualization to use? The article offered a pointed claim: stop making charts when a table is better. The author argues that because tools such as Excel make charts so easy to create, people tend to overuse them. In many cases data can be read and understood more easily when it is simply placed in an easy-to-follow table, or in a correctly made visualization. This fits precisely with the in-class comparison of two graphs, in which we were encouraged to use graphs that tell a story instead of just relaying the data. When trends and significance are easily spotted, graphs are more useful.
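As a quick illustration of the point (my own minimal R sketch with made-up numbers, not taken from the article), the same small dataset can be shown as a plain table or as a chart; when the exact values are what matters, the table is often easier to read:

```r
# A minimal sketch: the same data as a table and as a chart
sales <- data.frame(
  region  = c("North", "South", "East", "West"),
  revenue = c(120, 95, 143, 110)
)

# As a table: print the exact values, sorted for easy scanning
print(sales[order(-sales$revenue), ])

# As a chart: a simple bar plot, more useful when the overall pattern is the point
barplot(sales$revenue, names.arg = sales$region,
        main = "Revenue by region", ylab = "Revenue")
```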
The next article moved away from the creation of graphics and towards how graphics can be used in a real-world situation. Like many companies today, Uber has its own internal data visualization framework that it uses for its driving and transportation infrastructure. The tool, which was recently released as open source, is called deck.gl, and it takes the interesting approach of integrating visualizations with maps. Since most of Uber's data concerns map usage (pick-ups, drop-offs, traffic, etc.), the tool is a flexible way to view large sets of data. The tools now available for visualization grow more and more complex in order to map the new connections that are constantly being formed.
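deck.gl itself is a JavaScript library, so the sketch below is not its actual API; it is just a minimal base-R illustration, with made-up coordinates, of the kind of pick-up/drop-off point data that map-centric tools are built to visualize at much larger scale:

```r
# Hypothetical pick-up coordinates (invented data, just for illustration)
set.seed(42)
pickups <- data.frame(
  lon = rnorm(5000, mean = -74.00, sd = 0.02),   # roughly around Manhattan
  lat = rnorm(5000, mean =  40.73, sd = 0.02)
)

# A basic density-style scatter: where do pick-ups cluster?
plot(pickups$lon, pickups$lat, pch = ".", col = rgb(0, 0, 1, 0.2),
     xlab = "Longitude", ylab = "Latitude",
     main = "Simulated ride pick-up locations")
```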
 https://qz.com/1107993/data-visualization-stop-making-charts-when-a-table-is-better/
https://techcrunch.com/2017/04/06/ubers-open-source-data-visualization-tool-now-goes-beyond-maps/
Week of 11/13
This week we continued our examination of process and predictive analytics. We moved on to the process side, such as Pareto analysis and time charts, which can help streamline information evaluation and production lines. However, since statistical analytics and related methods remain so significant, I once again chose articles from that field, as it held more current relevance.
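As a refresher on the Pareto idea, here is a minimal sketch of my own with invented defect counts: a Pareto chart is simply a bar chart sorted by frequency, with a cumulative-percentage line overlaid to highlight the "vital few" causes:

```r
# Invented defect counts by cause, for illustration only
defects <- sort(c(Labeling = 42, Packaging = 27, Mixing = 12, Shipping = 6, Other = 3),
                decreasing = TRUE)
cum_pct <- cumsum(defects) / sum(defects) * 100

# Bars for the counts, with the cumulative percentage rescaled onto the same axis
par(mar = c(5, 4, 4, 4))
bp <- barplot(defects, ylab = "Defect count", main = "Pareto chart of defect causes")
lines(bp, cum_pct / 100 * max(defects), type = "b", pch = 19)
axis(4, at = seq(0, max(defects), length.out = 5), labels = paste0(seq(0, 100, 25), "%"))
mtext("Cumulative percentage", side = 4, line = 2.5)
```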
For the first article I found an interesting take on the need for statistical processing in sports, in particular the NBA. The Sixers, although not known for any particular prowess in past seasons, have been known as the NBA team at the forefront of statistical implementation. Statistical predictions can be used to evaluate incoming talent and the general strengths of a team. Teams have recently begun leaning increasingly on the analytics field, with the Sixers alone having a ten-member team dedicated solely to analytics. Growing out of baseball's first "sabermetrics" work in the 1970s, statistics in sports has become increasingly lucrative. Why not? If a team can be strengthened through mathematical analysis, then there is profit in staffing an analytics department. Particularly relevant to this class, I felt, was the line in the article, "Everything is trackable." In information science it is incredibly easy to access massive quantities of information, all of which can be evaluated to provide insight, whether to a company or to a sports team.
The second article I felt had relevance to our class was one mixing several topics we have gone through over the past few weeks: security and analytics. The article looks into the GRC (governance, risk, and compliance) market and how new tools can increase security and reduce risk. One tool examined was for "vendor risk management," which proposes that GRC can be improved by constantly running data analyses that can then be searched for anomalies. The article stresses the importance of machine learning, and in turn AI, in this regard, so that the analytics can evaluate and adapt to the systems they monitor and thus learn where weaknesses may arise.
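A very rough sketch of the anomaly-searching idea (my own toy example, not the vendor tools the article describes): continuously collected metrics can be scored against their own history, and unusually large deviations flagged for a security review:

```r
# Made-up daily metric for a vendor (e.g., failed login attempts)
set.seed(1)
metric <- c(rpois(60, lambda = 5), 40)   # 60 normal days, then one spike

# Flag values more than 3 standard deviations above the historical mean
baseline_mean <- mean(metric[1:60])
baseline_sd   <- sd(metric[1:60])
z_scores      <- (metric - baseline_mean) / baseline_sd

anomalies <- which(z_scores > 3)
data.frame(day = anomalies, value = metric[anomalies], z = round(z_scores[anomalies], 1))
```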
http://searcherp.techtarget.com/feature/New-GRC-tools-include-analytics-AI-to-keep-businesses-safe
http://www.philly.com/philly/sports/sixers/philadelphia-76ers-analytics-stats-trust-the-process-sam-hinkie-20171115.html
This week we moved a little past the technical aspects of information science and started to look at the connection to statistics. Using data and trends, we looked at how to find standard errors, identify trends, and apply predictive analytics as a whole. Given my personal interest in statistics, I decided to look into how predictive analytics is being used today to solve problems and track shifts in data.
The first article I found examined how human behavior can be evaluated through the use of big data. The particular activity monitored was medication nonadherence, when patients fail to take their medication for whatever reason. Led by health professionals from the local Brigham and Women's Hospital, and discussed at a healthcare forum in Boston, the idea is that nonadherence can be predicted and, from there, an intervention can be staged for the patient if it is needed. According to the article, data analytics can be used to find out who is and isn't taking their medication, as well as who is at risk of deviating from their specific healthcare plan. The idea that simple statistical approaches can be used to predict human behavior is an interesting window into how predictive analytics, and simple numbers, can be used to evaluate how humans think and act.
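To make the prediction idea concrete, here is a minimal sketch of my own, with invented variables and simulated data rather than the model the article describes: a logistic regression that scores each patient's risk of nonadherence from a few plausible features:

```r
# Invented patient data, for illustration only
set.seed(7)
n <- 500
patients <- data.frame(
  refill_gaps   = rpois(n, 2),           # past gaps between refills
  num_meds      = sample(1:8, n, TRUE),  # how many medications prescribed
  missed_visits = rpois(n, 1)
)
# Simulate an outcome loosely tied to those features
p <- plogis(-2 + 0.5 * patients$refill_gaps + 0.2 * patients$missed_visits)
patients$nonadherent <- rbinom(n, 1, p)

# Fit a logistic regression and rank patients by predicted risk
model <- glm(nonadherent ~ refill_gaps + num_meds + missed_visits,
             data = patients, family = binomial)
patients$risk <- predict(model, type = "response")
head(patients[order(-patients$risk), ])
```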
The next article was a less human-centered application of predictive analytics; instead it looked at how mosquitoes can be tracked to follow the spreading threat of the West Nile virus. In this situation predictive analytics is used to track current outbreaks and predict what may happen next. Using this technology allows resources to be deployed more efficiently and effectively. Predictive models built from past occurrences, much like past outputs from certain companies, can give great insight into where things might go next. This application of predictive analytics could be used to save lives.
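Another hedged sketch along the same lines (again a toy example of my own, not Chicago's actual model): weekly counts of virus-positive mosquito traps could be modeled with a Poisson regression on weather variables, and the fitted model used to anticipate a coming week's risk:

```r
# Toy weekly data: positive traps vs. temperature and rainfall (all invented)
set.seed(3)
weeks <- data.frame(
  temp = runif(52, 10, 32),
  rain = runif(52, 0, 60)
)
weeks$positive_traps <- rpois(52, exp(-2 + 0.12 * weeks$temp + 0.01 * weeks$rain))

# Poisson regression: expected trap positives as a function of weather
fit <- glm(positive_traps ~ temp + rain, data = weeks, family = poisson)

# Predict the expected count for a hypothetical hot, wet week
predict(fit, newdata = data.frame(temp = 30, rain = 45), type = "response")
```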
http://www.govtech.com/fs/data/Chicago-Turns-to-Predictive-Analytics-to-Map-West-Nile-Threat.html
 http://www.healthcareitnews.com/news/predictive-analytics-can-spot-patients-not-taking-their-medicine
10/28
This week we progressed into information processing, looking further at R and at how R and SQL can be used in tandem for efficient data processing. I have found it difficult to keep up with some of the R and XML aspects of the class, simply because I have never used them before; however, I have enjoyed the skills I have been learning. For my articles this week I looked into how data can be further processed, and into tools that make the process easier.
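As an example of what using R and SQL in tandem can look like (a minimal sketch assuming the DBI and RSQLite packages are installed; our class setup may differ), R can hand an aggregation off to SQL and pull the result back as a data frame:

```r
library(DBI)
library(RSQLite)

# An in-memory SQLite database, just for demonstration
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Store a small table from R into the database
dbWriteTable(con, "orders", data.frame(
  customer = c("A", "B", "A", "C"),
  amount   = c(25, 40, 15, 60)
))

# Let SQL do the aggregation, then pull the result back into R
totals <- dbGetQuery(con, "SELECT customer, SUM(amount) AS total
                           FROM orders GROUP BY customer ORDER BY total DESC")
print(totals)

dbDisconnect(con)
```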
The first article pertained more directly to the use of typical statistics than to any particular aspect of programming. Instead, it looked at how statistical analysis can sometimes be improperly read or applied. This is incredibly important to us as information scientists so that we can ensure our data is well processed. An interesting angle was the role of AI and machine learning in drawing statistical conclusions. There is a bit of a debate here, because while AI can help apply the data without bias, it cannot (for now) detect many of the nuances or make the recommendations of a human analyst. Being data-driven does not necessarily imply a stronger conclusion, and results must still be analyzed for realistic outcomes.
Next, with the relevant discussion of the possibility of Amazon coming to our city, was an article announcing the release of Amazon Aurora PostgreSQL. This service from Amazon claims to deliver "the performance and availability of high-end commercial databases." Since we had discussed companies that host other companies' information on their servers, such as Netflix running on Amazon, I found it interesting that Amazon has become even stronger in yet another area. The new relational database is billed by the hour for each database instance used. Because it is backed by such a large company, it is easily scalable and reliable, and it is said to be compatible with the commonly used PostgreSQL tools and drivers. Not only has the service just become generally available, but several large brands are already implementing it, such as Capital One, known for its technology-based approach to banking, as well as Verizon and Nielsen.
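Because Aurora PostgreSQL is meant to be compatible with standard PostgreSQL, ordinary clients should be able to connect to it; here is a hedged sketch of what that might look like from R using the DBI and RPostgres packages (the host name, database, and credentials are placeholders, not a real endpoint):

```r
library(DBI)
library(RPostgres)

# Placeholder connection details for a hypothetical Aurora PostgreSQL cluster
con <- dbConnect(
  RPostgres::Postgres(),
  host     = "mycluster.cluster-xxxxxxxx.us-east-1.rds.amazonaws.com",  # hypothetical
  port     = 5432,
  dbname   = "analytics",
  user     = "report_user",
  password = Sys.getenv("PGPASSWORD")   # keep credentials out of the script
)

# Ordinary PostgreSQL queries work as usual
dbGetQuery(con, "SELECT version();")

dbDisconnect(con)
```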
https://diginomica.com/2017/10/05/data-drivers-need-insightful-navigators/
http://www.businesswire.com/news/home/20171024006836/en/AWS-Announces-General-Availability-Amazon-Aurora-PostgreSQL
This week brought an introduction to analysis in R. By downloading and manipulating some basic data, we began to look into how statistical work can be done in R, as well as basic table storage.
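In that spirit, here is a minimal sketch of the kind of first steps involved, using one of R's built-in datasets rather than the files we downloaded in class:

```r
# Load a built-in dataset and take a first look
data(mtcars)
head(mtcars)          # peek at the first rows
str(mtcars)           # column types and structure

# Basic descriptive statistics
summary(mtcars$mpg)
mean(mtcars$mpg)
sd(mtcars$mpg)

# Basic table storage: group counts and a saved copy of the data frame
table(mtcars$cyl)                                         # cars per cylinder count
write.csv(mtcars, "mtcars_copy.csv", row.names = FALSE)   # write the table to disk
```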
Of course, even with that basic introduction I still wondered about the importance of learning such a language, and how the skills learned in class could be applied to a real-life job or research situation. I started with an article from R-bloggers (which may have a slight bias) that looked to answer questions about what R is and why it is important. The point that stuck with me most is the sheer popularity of R in data science and statistics. As a math major who hopes to specialize in statistics, this stood out as a reason why R programming could benefit my career. Mathematics especially is moving away from pen-and-paper work, and the idea of having a system that can hold large statistical datasets and run computations is appealing. Both Google and Facebook hire for R skills, and they are widely considered two of the major technology innovators and data collectors in the market today.
More important to me than why people use R and other major analytics tools is how they have been used in the past. The next article I found looked at companies using analytics and how they use them. Interestingly enough, they explained that Walmart was an early adopter of analytics, which of course makes sense: different locations needed to cater to different needs in order to ensure the highest profit margins. A secondary example, this time involving R, is the NYU student-run Zero Intelligence Agents blog. This blog uses R to sort intelligence regarding troops in Afghanistan, examining information about conflicts as well as what is publicly known about the exact nature of any confrontations.
A final example again harnesses R, this time by the US government. During the BP oil spill several years ago, the government used R to analyze data that helped coordinate and allocate resources for the response. The language helped eliminate guesswork by running statistical analyses that produced the best estimates from the combination of public and private sector data available at the time.
It is interesting to see how a tool that is relatively simple to pick up can be applied in large companies, and even large governments. Such skills and knowledge are truly applicable to the real world, which makes them that much more important to master.
https://www.r-bloggers.com/why-you-should-learn-r-first-for-data-science/
Week of 10/9
As we move further into database creation, we have been discussing different ways to construct databases and what material fits best within them. Working through the process of creation, as in Assignment 3, it can become difficult to figure out what goes where, or even just where to start with the information we have. It was interesting to think about how to categorize information, forming categories and relationships that are typically overlooked in everyday life.
For the database creation in our assignment, we first began by mapping out the relationships in UML. While researching how best to create the class diagram, I came across an article discussing an update to SQL Server, Microsoft's database system, that adds graph tools. These new SQL Server graph databases allow data to be modeled as nodes and edges directly within SQL, which makes the relationships between tables easier to interpret without the need for external tools. It is interesting to consider how much of a difference it makes to be able to better represent the way data relationships are portrayed in a table. This hit home after our assignment, where I realized the difficulty that can come with processing and displaying data, which really shines a light on the importance of tools that cater to readability and clear display.
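As a rough sketch of the node-and-edge idea the article describes (my own illustration, written from documentation rather than tested against a real SQL Server instance, so the details may differ), relationship data can be declared as node and edge tables and then queried with MATCH; from R, the statements could be sent over an ODBC connection:

```r
library(DBI)

# Hypothetical connection to a SQL Server 2017+ instance via the odbc package
con <- dbConnect(odbc::odbc(), Driver = "ODBC Driver 17 for SQL Server",
                 Server = "myserver", Database = "GraphDemo",
                 UID = "user", PWD = Sys.getenv("MSSQL_PWD"))

# Node and edge tables (T-SQL graph syntax; sketch only)
dbExecute(con, "CREATE TABLE Person (id INT PRIMARY KEY, name NVARCHAR(100)) AS NODE;")
dbExecute(con, "CREATE TABLE knows AS EDGE;")

# Query relationships with MATCH instead of explicit join conditions
dbGetQuery(con, "
  SELECT p2.name
  FROM Person p1, knows, Person p2
  WHERE MATCH(p1-(knows)->p2) AND p1.name = 'Alice';")

dbDisconnect(con)
```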
Moving on from our own database creation, I found an incredibly interesting and relevant article about a proposed database to monitor the opioid crisis in America. An Ohio congressman has proposed creating a database that centralizes information about funding, outbreaks, and research so that everyone involved in combating the issue has access to all available resources. This brings to light one of the most important aspects of database creation: the ability to bring together information from many sources, allowing for the greatest possible efficiency. This database in particular would allow both public and private researchers to look at up-to-date information, increasing awareness and education for all interested parties. Beyond providing information, one aspect I found interesting was the ability of such a database to flag suspicious medical prescriptions. If all attempts to receive opioid prescriptions are monitored, it is claimed that addiction can be significantly reduced, because the system will flag those who are "doctor shopping," or going to multiple doctors in search of a high-strength prescription. The ability to flag is a testament to the strength of data science and evaluation: simply by seeking out outliers or recurring patterns in the information, important insight can be gained.
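A hedged sketch of the "doctor shopping" flag (my own toy example with invented records, not the proposed federal system): count the distinct prescribers per patient and flag anyone above a chosen threshold:

```r
# Invented prescription records, for illustration only
prescriptions <- data.frame(
  patient    = c("P1", "P1", "P1", "P1", "P2", "P2", "P3"),
  prescriber = c("DrA", "DrB", "DrC", "DrD", "DrA", "DrA", "DrB"),
  drug       = "opioid",
  date       = as.Date("2017-10-01") + c(0, 3, 7, 12, 1, 20, 5)
)

# Distinct prescribers per patient within the window
prescriber_counts <- aggregate(prescriber ~ patient, data = prescriptions,
                               FUN = function(x) length(unique(x)))
names(prescriber_counts)[2] <- "distinct_prescribers"

# Flag patients seeing more than a chosen threshold of prescribers
threshold <- 3
prescriber_counts$flagged <- prescriber_counts$distinct_prescribers > threshold
print(prescriber_counts)
```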
https://www.healthdatamanagement.com/news/congressman-calls-for-hhs-opioid-epidemic-electronic-database
http://searchsqlserver.techtarget.com/tip/SQL-Server-graph-database-tools-map-out-data-relationships
Week of 10/2
This week we looked deeper into how to go about creating databases and writing in XML. Turning to the actual input side of information science was a new experience, and it helped me start thinking with more of a computer science mindset.
I initially struggled with the more technological approach; however, I enjoyed the way that XML takes a logical, hierarchical approach to storing information. I looked into how information is streamlined, and how XML stacks up against other markup languages and programming environments. I came across an article that explained and analyzed the structure of XML, examining why XML, and more importantly the structure of the language, is an important facet of data analytics. When you use XML you follow a very structured flow, and all aspects and tags of the language are obvious, searchable, and well defined. Other formats, however, have what the article describes as a "blob of text," which can vary by programmer and field. These differences make it increasingly difficult for information to be accessed and sorted. Moreover, standards can waver across fields, which limits data mining and novel findings.
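To make the contrast concrete, here is a minimal sketch of my own using the xml2 package (assuming it is installed): once the data is tagged, individual fields can be pulled out by name instead of being scraped out of free text:

```r
library(xml2)

# A tiny XML document: every field is explicitly tagged
doc <- read_xml('
  <articles>
    <article>
      <title>Structured XML versus blob text</title>
      <year>2017</year>
    </article>
    <article>
      <title>Enterprise search</title>
      <year>2017</year>
    </article>
  </articles>')

# Because the structure is explicit, fields are directly searchable
titles <- xml_text(xml_find_all(doc, ".//article/title"))
years  <- as.integer(xml_text(xml_find_all(doc, ".//article/year")))
data.frame(title = titles, year = years)
```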
These shortcomings are then contrasted with XML. Although documents in other formats can be converted to XML, the result may not have the same intricacies as a native XML file. Thus the article champions the use of XML, and it gave me an interesting window into some of the advantages that learning XML can bring. Clearly this is a marketable and desirable skill, which increases my desire to learn how to apply it to my research.
The next article I looked into was more in line with the database side of our lesson this week. Although not necessarily focused on SQL-based databases, the article took our use of databases and extended it into the future of enterprise search. These new search databases are meant to marry data content and text content, allowing both structured and unstructured content to be used to model information. In fact, Google is also looking at joint retrieval across databases and text. This comes with a host of issues, from security, to recall tags, to even simply storing such different types of information.
Stemming off of these combination databases is a concept called Linked Enterprise Data. Linked Enterprise Data is intended to bring together many types of external data in order to make it more easily readable within an enterprise environment such as a research firm or a data collection agency. Simplifying and streamlining data is important, as referenced in the previous article, and allows for effective use across an organization. There is real value in data integration and unification: as the amount of data available continues to grow, it becomes more and more vital for it to be easily compiled. Gaining access to more information in a simpler way streamlines, and improves, data analytics and information science as a whole.
 http://www.cmswire.com/information-management/the-next-generation-of-enterprise-search-is-in-sight/
https://www.researchinformation.info/news/analysis-opinion/structured-xml-versus-blob-text
http://www.dataversity.net/learning-about-linked-enterprise-data/
Post 3
https://gcn.com/articles/2017/09/15/nyc-dot-sign-assets-workflow.aspx
https://www.timeshighereducation.com/features/how-do-universities-use-big-data
This week we considered large amounts of data and how they can be categorized and organized. Data organization is not only necessary for software firms; it can be applied in a multitude of different ways. For instance, when researching applications of SQL servers and examples of relational databases, I came across a recent article about New York City's management of traffic signs. In the past, it seems, the city relied on workers, and their often hard-to-decipher or inaccurate handwritten notes, to locate the various signage throughout the city.
To solve the problem, they converted the information into a SQL-based database. This database took the disorderly, individualized notes from before and turned them into a system that is easy to search and filter. With the data in a SQL server, the information is more widely available and easier to decipher than it was in handwritten form. Furthermore, the database can be conveniently shared. Sharing allows other departments, such as law enforcement, to look into the information and tie it to traffic citations. Uploading all the data has also reduced conflicts between work orders and clarified the effects of applying a new work order in an existing area.
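Here is a minimal sketch of the underlying idea (a toy schema of my own, not the city's actual system): once sign records sit in a relational table, locating signage becomes a simple filtered query rather than a hunt through handwritten notes:

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy sign inventory table
dbWriteTable(con, "signs", data.frame(
  sign_id   = 1:4,
  street    = c("Broadway", "Broadway", "5th Ave", "Canal St"),
  sign_type = c("No Parking", "Speed Limit 25", "One Way", "No Standing"),
  installed = c("2015-06-01", "2016-03-12", "2014-11-30", "2017-01-15")
))

# Easy to search and filter: every No Parking sign on Broadway
dbGetQuery(con, "SELECT sign_id, sign_type, installed
                 FROM signs
                 WHERE street = 'Broadway' AND sign_type = 'No Parking';")

dbDisconnect(con)
```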
Moving from the use of databases in city infrastructure, I found an incredibly interesting article about the use of databases in universities. Schools such as Georgia State University are using databases to store and analyze information about the students who attend. By looking at trends in their students, they have made a massive push to increase graduation rates. Their use of information, information they already had about their students, has increased graduation rates among students from minority groups and low socioeconomic backgrounds. Their database looks at factors such as major, grades, and classes enrolled in to examine each student's path through the university. If students are deviating from their path, or struggling in their classes, the system immediately connects the student with their academic counselor, allowing them to remediate any issues they may be having. Other universities are harnessing their big data resources in different ways. For example, Tufts looks at data on its own performance, such as graduate school placements and research conducted, to see where it stacks up nationally. Looking at their own performance can give universities an edge when deciding which aspects to improve or reorganize.
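A hedged sketch of that early-warning idea (invented records and thresholds of my own, not Georgia State's actual system): student records can be scanned each term and anyone drifting off track flagged for an advising appointment:

```r
# Invented student records, for illustration only
students <- data.frame(
  id            = 1:5,
  gpa           = c(3.4, 2.1, 3.8, 2.6, 1.9),
  credits_term  = c(15, 9, 16, 12, 6),
  major_courses = c(2, 0, 3, 1, 0)   # major-required courses taken this term
)

# Simple, transparent rules: low GPA, light load, or no progress in the major
students$at_risk <- with(students,
  gpa < 2.5 | credits_term < 12 | major_courses == 0)

# These students would be referred to an academic counselor
subset(students, at_risk)
```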
As with all programs that collect, upload, and examine personal information, there has been some hesitation when it comes to using student information in universities. There is an ethical concern in looking at a student's background to immediately assess whether or not they are capable of graduating from a specific institution. However, the counterargument is that using this information actually places students at the level where they will be most successful, even if it is not the university they believe they should be at.
Database management and information collection have become increasingly practical in the last few years. Applying the information a company already holds, such as street sign locations or student backgrounds, allows that information to be put to its best use, increasing overall productivity.
Week of 9/18
This week we considered how information can be gathered and evaluated. I found myself curious about how data collection has evolved as we have entered a truly digital age. Where before information had to be shared in person or on paper, we now have access to e-mail, video calls, and a multitude of online databases available to everyone.
I was initially curious about the evolution over a long span of time, from the very invention of internet communication and the cell phone. Are we more efficient than before? Or has the technology overwhelmed our ability to distinguish between what really matters and what doesn't? The basics have truly never changed: an interview has always been a way to gather information, observation has been how all the masters have learned, and document analysis has always been available in libraries. Yet now there is so much more information, and so many more ways to share and collect it.
I set out to compare opinions on the time before and after our technical age. However, even cursory research suggested that our modern era is almost a new age of information research in itself, and that the changes have been so significant that the sheer quantity of information collected is nearly impossible to compare. Instead, the first article I found looks at what it calls "Before Big Data" and "After Big Data." In just a few years we have entered this technological era of Big Data, where we no longer look only at small-scale information, such as an individual company's resources, but draw in information from all across the internet. Although more tends to be better, at least in terms of research, it can also be overwhelming. This era of data calls for a whole new set of skills, many of which we have begun talking about in class, such as SQL storage and public data collection. It has become more than just looking at data; we are now in an era in which we need to gather the data, clean the data, and from there evaluate it.
After covering our modern era, the article began to look at the future of the evolution of technology and data collection. Most interesting to me was the idea that we have moved into what the article calls "Analytics 3.0." It seems we are moving from large-scale collection toward instantaneous results and predictive analysis. With so much data available, we have reached a point where we don't necessarily need to gather and create data, but instead refine and analyze it. Prescriptive analytics takes the information we have been learning to collect and applies logic and statistics to it. Data can be retrieved instantly, and from there the information can be synthesized through methods we could never have considered even 10 or 15 years ago.
              The methods to gather data are always evolving, expanding options, and allowing for more and more conclusions to be drawn than ever before.
http://taxandbusinessonline.villanova.edu/resources-business/article-business/the-evolution-of-data-collection-and-analytics.html
 http://analytics-magazine.org/the-analytics-journey/
Week of 9/11
Oftentimes students consider much of what is taught to lack practical application, and quite honestly much of it does. Looking at this lesson and considering its methods for efficiency and decision making, I was curious whether they are commonly used or are just wishful thinking.
After reading the linked article concerning the Ishikawa diagram, which is used to identify production design factors, I saw that it was referenced as having been used in developing the Miata. Considering how to implement the model, I came across a webpage (https://www.miata.net/news/nc_debut.html) illustrating a practical application of the diagram. The company picked six key categories to embody the vehicle. Breaking design down into easily pursued attributes ensured a starting point in which no important aspect, such as styling or driving feel, would be overlooked. By extracting the key aspects of the vehicle, the engineers could ensure that the design was true to the roots of the company while fully encapsulating all the elements of design they had deemed essential. Though it may not be strictly necessary to use this specific fishbone chart to complete a successful project, I found the idea of compartmentalizing important topics interesting, and it can be applied more broadly to any design or plan.
However, it is more difficult to find further examples of a model being used in its exact form, such as a company reporting its use of SECI or stating that it applied the Shannon-Weaver model. Instead, it seems these models are taken as a foundation of principles to keep in mind, a base for plans, or even an unconscious framework used to transfer information. Companies aren't announcing their use of, for example, SECI; however, they continue to cycle information through each of the categories of knowledge simply through daily research and the acquisition of tacit knowledge.
Yet one aspect of this week's discussion seemed to present itself frequently in its full form: decision trees. To put it plainly, decision trees seem to be the simplest way to map out options and choose the most efficient decision for your particular enterprise. Although decision trees can easily become more complicated when important decisions are at stake, at the core they simply lay out what decision needs to be made. Applications of the process vary, from investing decisions, to algorithms in programming, to our simple school exercises. The most interesting application I found while researching was that of NASA rovers. An article examining the future of NASA probes after Cassini (https://www.engadget.com/2017/09/15/cassini-saturn-next-for-nasa/) explains the idea of running a system of decision trees so that a rover can make changes based on its surroundings. Instead of trying to develop a full AI system, or forcing an analyst at mission control to make every decision, adding more competent decision-tree software to rovers allows the on-board computer to determine the best course of action, which then only has to be approved by an engineer on the ground. This application of decision trees to major exploration shows how a simple list of decisions can be used in multi-million-dollar machines millions of miles from Earth.
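In the statistical sense of the term, a decision tree can also be fit directly in R. Here is a minimal sketch using the rpart package on R's built-in iris data, just to show the mechanics; it is unrelated to NASA's software, which the article describes only at a high level:

```r
library(rpart)

# Fit a classification tree on R's built-in iris data:
# predict the species from the flower measurements
tree <- rpart(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
              data = iris, method = "class")

# The printed tree is literally a list of yes/no decisions
print(tree)

# Use the tree to classify a new observation
new_flower <- data.frame(Sepal.Length = 6.0, Sepal.Width = 2.9,
                         Petal.Length = 4.5, Petal.Width = 1.5)
predict(tree, new_flower, type = "class")
```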