
How to Automatically Extract and Label Data Points on a Seaborn KDE Plot | by Lee Vaughan | Sep, 2023


DALL·E 2023: An impressionist painting of an undulating mountain range with brightly colored circles along the ridgeline (all remaining images by the author).

A Kernel Density Estimate (KDE) plot is a method, similar to a histogram, for visualizing the distribution of data points. While a histogram bins and counts observations, a KDE plot smooths the observations using a Gaussian kernel. As alternatives to histograms, KDEs are arguably more attractive, easier to compare in the same figure, and better at accentuating patterns in data distributions.

A histogram versus a KDE plot
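To make the "bin and count" versus "smooth" distinction concrete, here is a minimal NumPy sketch. The data are synthetic, the helper name `gaussian_kde_curve` is hypothetical, and the bandwidth uses Scott's rule of thumb as an assumption (seaborn's own defaults may differ in detail).

```python
import numpy as np


def gaussian_kde_curve(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on `grid` (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    # One Gaussian bump per observation, centered on that observation.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    # Averaging the bumps gives a smooth curve whose area is close to 1.
    return bumps.mean(axis=1)


rng = np.random.default_rng(0)
ages = rng.normal(loc=58, scale=11, size=500)  # synthetic, for illustration only

# Histogram: sort the observations into bins and count them.
counts, edges = np.histogram(ages, bins=10)

# KDE: smooth the observations. Bandwidth from Scott's rule (an assumption).
bandwidth = ages.std(ddof=1) * len(ages) ** (-1 / 5)
grid = np.linspace(ages.min() - 3 * bandwidth, ages.max() + 3 * bandwidth, 400)
density = gaussian_kde_curve(ages, grid, bandwidth)

# Sanity check: the area under the density curve should be close to 1.
area = density.sum() * (grid[1] - grid[0])
print(f"{counts.sum()} observations in {len(counts)} bins; KDE area = {area:.3f}")
```

The histogram's shape depends on where the bin edges fall, while the KDE's shape depends on the bandwidth. A smaller bandwidth gives a bumpier curve and a larger one gives a smoother curve.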

Annotating statistical measures like the mean, median, or mode on KDEs makes them more meaningful. While adding lines for these measures is easy, making them look clean and uncluttered is not.

Marker lines added with the easy method (left) vs. with the harder but more attractive method (right)

In this Quick Success Data Science project, we'll use US Census and Congressional datasets to programmatically annotate multiple KDE plots with median values. This approach will ensure that the plot annotation automatically adjusts for updates to the datasets.

For more details on KDE plots, see my previous article here.

Because the US has Age of Candidacy laws, the birthdays of members of Congress are part of the public record. For convenience, I've already compiled a CSV file of the names of the current members of Congress, along with their birthdays, branch of government, and party, and stored it in this Gist.

For the US population, we'll use the Census Bureau's Monthly Postcensal Civilian Population table for July 2023. As with the previous dataset, this is public information that I've saved to a CSV file in this Gist.

For this project, we'll need to install seaborn for plotting and pandas for data analysis. You can install these libraries as follows:

With conda: conda install pandas seaborn

With pip: pip install pandas seaborn

