Data Science

Data, Do You Mined?

As I approach the end of my data science journey, I still find it extremely hard to explain to parents, colleagues, and friends (outside the tech field) exactly what I’ll be aiming to do as a data scientist, and quite frankly, I still have a hard time pinpointing it myself. Through my studies, I’ve begun to realize how widely data processing is applied, which also explains why every major tech company wants to hire a data scientist in the first place. But I always end up (half) jokingly answering: “I’m there to turn all that data everyone collects into profit.” I usually get a slight chuckle in response and still have to explain one possible role for a data scientist; my usual example is the Amazon recommendation system, which a surprising number of people seem to understand. But for once, I’d like to answer that question comprehensively, for everyone and for myself.



I’ve come to understand that data is much bigger than mere suggestion algorithms, and I both marvel at and fear the potential that data brings, because it will affect every industry and has the capacity to permanently change the world.


Many skeptics argue that whatever changes big data brings will be offset by economics and government, as they “always” were in the past. While humanity did manage to survive and thrive after the agricultural and industrial revolutions, it did not do so without significant changes to people’s welfare, and not always for the better. We tend to view the past romantically, but the new demands of each technology brought turmoil as well as prosperity. The agricultural revolution allowed humanity to stockpile food, which led to the rise of cities and empires. Yet along with it came slaves, fiefdoms, and conquest. Meanwhile, the industrial era is known for exponentially increasing productivity as machines took over much of the intensive labor needed for carpentry and garments. But along with capital gains grew massive economic inequality, environmental destruction, and, at least on some levels, global warfare.

Some argue that in this era of advanced information technology, whatever problems human beings concoct can be solved by furthering technology, and this is true to some extent. Yet it would be naive to adopt this view wholesale, because it ignores the fact that many of the problems solved by technology were created by people’s application of that technology. To put it another way, a pair of scissors can be a tool for both art and murder. It simply depends on whose hand it’s in.

On the same note, I’ve seen and read about the wonderful things that data has done for the world. IBM’s Watson is making strides in the medical community while Tesla’s autopilot technology has long surpassed human error. Even on a daily basis, Google’s algorithms find the best route to work and Facebook somehow connects me to people I forgot I knew. And of course, it would be impossible not to mention the Goliath that is Amazon, which has quite literally changed the marketplace for both consumers and suppliers.

Only of late has the rise of big data and technology drawn skeptical eyes from both the government and the public. The use of Facebook by Russians to interfere in the 2016 elections in many ways became the catalyst for the US government and public to start distrusting social media. However, those critics of technology are too narrowly focused on the pitfalls of social media in particular. Those of the older generation complain that screen time proves detrimental to the youth’s attention span and mental health, and while this may be true, it is only the tip of the iceberg of what is to come.

In his book 21 Lessons for the 21st Century, Yuval Noah Harari gives us incredible foresight into the implications of our advancing technologies, especially as they apply to the realms of information and biotechnology. He imagines a world where algorithms have been refined to such an extent that they’re able not only to predict what we should buy or watch for the sake of profit, but also to determine where we should live, which occupation would give us optimal satisfaction in life, and even who our romantic partners should be. After all, the universality of data means that any algorithm capable of detecting trends would be able to optimize for such things in life.


In terms of biotechnology, data combined with advancements in medical technology might eventually allow us to treat or avoid all ailments. Harari draws images of biosensors, whether external or internal, that continuously measure our mental and physical health and give us feedback according to our actions. Such machinery could keep track of foreign bodies entering our bloodstream or give immediate feedback about our vital signs. Thinking about eating a cheeseburger? The sensor may show us how much that would increase our cholesterol levels. It could also be implemented to help us sleep better and even control our dreams.



Those skeptical of such pervasive use of personal information are wise to be so. As with the scissors, it all depends on who owns the data, and it almost certainly will not be in the hands of the individual. Just as prediction algorithms could predict optimal choices for us, they could also be used against us to force us into these choices. In this Orwellian future, the data’s owner (presumably the government) could punish us immediately and arbitrarily for making choices detrimental to the overall system, or at least choices it deems detrimental. Biosensors could alert insurance companies when we are abusing our health, which might trigger a rise in premiums. Or perhaps, as in the film Gattaca, we might find discrimination against non-modified humans who still carry diseases and disorders that have otherwise long been eliminated.

Some may see Harari’s future as exaggerated, and they’re not entirely wrong. Such leaps in technology will be rigorously debated and tested before they are implemented, and depending on the society, it may take decades before biosensors are fully adopted. With the looming climate crisis, decades seem like a generous assumption for human society.


The much more immediate impact concerns the economy and industry, and the American Midwest already feels this pain. The rise of automation is the antithesis of manufacturing jobs. Just as artisans sewing elaborate garments could not compete against the productivity of the shuttle loom, factory workers cannot compete with the productivity of tireless machines. And though the US as a whole has transitioned to a more service-oriented economy, the next sector to feel the brunt of automation will be transportation. Truck drivers, bus drivers, postal and delivery workers, and even rideshare drivers will soon have to compete with cheaper, safer, and more productive mechanical counterparts. This will displace millions of workers and upend the economy.


This is not some ambiguous apocalypse written in prophecy; it is already happening, and many economists warn it will worsen. Even the President’s former chief economic adviser, Gary Cohn, wanted to put this issue at the forefront of his agenda. Unless something is done, millions will be jobless as highly technical jobs require more and more expertise. The current political strategy seems to be protectionist policy for certain jobs (e.g. corn subsidies), but this is inefficient and impractical over the long run because it works against free-market principles. Rather, the US ironically needs to make better use of data to protect its people. Though it may be Andrew Yang’s primary pitch for the presidency, universal basic income may end up being a necessary counterweight to the huge economic disparity that will inevitably arise when automation allows the hands of a few to influence the rest of society. Whatever policy or approach is taken, it must be implemented soon.

Alas, I must confess that I do not know much about the data science field aside from what I have read. Some suggest that data hasn’t come as far as we thought (source), and they may be right. However, therein lies my primary motivation and excitement about this field. I truly feel that it will bring both miraculous and catastrophic changes to the world, but unless I learn and apply it firsthand, whatever words and opinions I write will lack credibility. I understand that it is a tall order, since data science is applicable to so many fields; one can spend a lifetime developing new deep learning algorithms without ever touching their application in business or medicine. But I feel that I must start somewhere, and that was the beginning of my data science course.

Somewhat regrettably, I feel like I know less about data than I did before I started learning, but optimistically, that is a symptom common to all new learners, for only the ignorant believe they’ve truly mastered anything. In the 21st century, understanding and manipulating data will become an increasingly important skill, and I hope this inspires some readers to learn about the subject, no matter where they are in life.
