Edge detection is a fundamental tool in computer vision. By using sharp contrasts in brightness, general outlines and shapes can be found in an image. Often used as a filter to find distinctions between items in a photo, edge detection offers a first stepping stone into object recognition.
While Python remains easy to use, quick to learn, and offers an overabundance of external libraries that can do almost anything, it has one critical weakness: it’s slow.
Of course, to the human eye, its sluggishness seems negligible. Usually Python only lags behind other programming languages by milliseconds; however, when iterating over millions or even billions of data points, it quickly adds up.
NumPy offers a unique solution. While allowing users to still write Python, it converts it into well written C code specifically optimized for numerical analysis.
Using NumPy arrays can increase a script’s performance by one or two magnitudes with very little learning curve. …
On my first job out of college, I was tasked with streamlining how a company made purchases. While a big project that encompassed many factors, such as lead times and order quantities, the most challenging part was determining how many units would sell over time.
With lead times in excess of 2 or 3 months, I needed to accurately predict if a certain product would run out before a new shipment came. It was a balancing act, because if I underpredicted how much product would sell, the company would run out and lose sales. …
Project Gutenberg offers over 60,000 full length books. Wikipedia contains over 55 million unique articles. Wattpad has over 400 million short stories. In the age of the internet, there is no shortage of literature to read.
These numbers, however, are completely overwhelming. A person could spend a lifetime attempting to read the entirety of the internet and never scratch more than a fraction of the surface.
The ocean of written material creates a paradoxical problem: because there’s an overabundance of information, finding relevant information becomes more difficult.
Automatically generating text summarizations may help the problem. Instead of leaving users to skim through large walls of text, presenting a brief summary provides the user with key pieces of information so they can make a more informed decision to continue reading without wasting the time to parse the text themselves. …
K-Nearest Neighbor (KNN) is an easy to understand, but essential and broadly applicable supervised machine learning technique. To understand the intuition behind KNN, examine the scatterplot below. The plot shows the relationship between two arbitrary dimensions, x and y. The blue points represent members of group A and the orange points represent the members of group B. This will represent the training data for KNN.
Retailers have access to an unprecedented amount of shopper transactions. As shopping habits have become more electronic, records of every purchase are neatly stored in databases, ready to be read and analyzed. With such an arsenal of data at their disposal, they can uncover patterns of consumer behavior.
A market basket analysis is a set of affinity calculations meant to determine which items sell together. For example, a grocery store may use market basket analysis to determine that consumers typically buy both hot dogs and hot dog buns together.
If you’ve ever gone onto an online retailer’s website, you’ve probably seen a recommendation on a product’s page phrased as “Customers who bought this item also bought” or “Customers buy these together”. More than likely, the online retailer performed some sort of market basket analysis to link the products together. …
From web development to data science, Python offers an incredibly diverse set of tools. Its easy-to-read syntax and quick learning curve makes it a popular language but it lacks the diverse and beautiful GUI support of web technologies. Anyone who’s used Flask, a popular and lightweight web framework, has probably wondered if they could take the same principles and apply them to desktop app development. The temptation of combining an HTML and CSS frontend with a Python backend is alluring,
A few libraries attempt to achieve this, but lack the customization options and don’t have a large community. …
Writing a News Article Recommender to Reduce Polarization and Radicalization
The popular Netflix documentary, The Social Dilemma, outlines many of the fundamental problems of social media. The film explores how the technology used by many platforms don’t necessarily align to user interests.
In theory, social media companies can create data-driven systems to deliver interesting and engaging content to users. Normally, this is a feature meant to filter irrelevant content, but in practice, it ensures nobody is confronted with opposing views if they don’t want to see them.
As people become increasingly polarized by the content they see online, it’s imperative that social media find a balance between delivering engaging content and actively driving users away from reality. …
With the age of big data comes the age of big geospatial data. Researchers increasingly have access to geographic information which can do anything from track the migration patterns of endangered species to map every donut shop in the country.
To help visualize this information, the python library Cartopy can create professional and publishable maps with only a few lines of code. Built with Matplotlib in mind, its syntax is familiar and easy to understand.
To start, we’ll create the simplest possible world map. Before writing any heavy code, the Cartopy and Matplotlib libraries in python should be installed.
import cartopy.crs as crs
import cartopy.feature as cfeature
import matplotlib.pyplot as…
Spotify presents no shortage of playlists to offer. On my home page right now, I see playlists for: Rap Caviar, Hot Country, Pump Pop, and many others that span all sorts of musical textures.
While many users enjoy going through songs and creating their own playlists based on their own tastes, I wanted to do something different. I used an unsupervised learning technique to find closely related music and create its own playlists.
The algorithm doesn’t need to classify every song nor does every playlist need to be perfect. …