# Profiling Target with Density Histograms

Density histograms are a standard way to plot distributions and appear in most statistics books and resources. Used for variable selection, they give a quick view of how well a given variable separates the classes of the target.

```r
library(funModeling)
## Loading funModeling !
data(heart_disease)
plotar(data=heart_disease, str_input="age", str_target="has_heart_disease", plot_type = "histdens")
```

The dashed line represents the variable's mean.

Density histograms are helpful to visualize the general shape of a numeric distribution.

This general shape is computed with a technique called a kernel smoother. The general idea is to smooth out the high and low peaks (noise) in neighboring points or bars by estimating the function that describes them. The Wikipedia article has some pictures that illustrate the concept: https://en.wikipedia.org/wiki/Kernel_smoother
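To make the smoothing idea concrete, here is a minimal sketch using base R's `density()`, which computes a kernel density estimate. The data are simulated (hypothetical, not taken from `heart_disease`), and this is not necessarily what `plotar` does internally; it only illustrates how a wider bandwidth flattens the peaks.

```r
set.seed(42)
# Simulated numeric variable (hypothetical data, not from heart_disease)
x <- rnorm(500, mean = 54, sd = 9)

# Kernel density estimates with a Gaussian kernel
dens_default <- density(x, kernel = "gaussian")              # default bandwidth
dens_smooth  <- density(x, kernel = "gaussian", adjust = 3)  # 3x wider bandwidth, smoother curve

# Histogram on the density scale with both curves overlaid
hist(x, breaks = 30, freq = FALSE, main = "Histogram with kernel density estimates")
lines(dens_default, lwd = 2)          # follows the bars more closely
lines(dens_smooth, lwd = 2, lty = 2)  # smooths out more of the noise
abline(v = mean(x), lty = 3)          # vertical line at the mean
```

The `adjust` argument scales the bandwidth: small values follow every bump in the bars, while large values may hide real structure.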

### Relationship with statistical tests

A statistical test sees something similar: it measures how different the curves are and summarizes that difference in a statistic such as the p-value used in the frequentist approach. This gives the analyst reliable information to decide whether the curves have, for example, the same mean.
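As a rough illustration of that idea, the sketch below runs a Welch two-sample t-test with base R's `t.test()` on simulated data. The group names, means, and standard deviations are assumptions chosen for the example, not values measured from `heart_disease`.

```r
set.seed(123)
# Hypothetical data: one numeric variable observed in two classes whose means differ
no_disease  <- rnorm(150, mean = 158, sd = 19)
has_disease <- rnorm(150, mean = 139, sd = 22)

# Welch two-sample t-test; the null hypothesis is that both classes share the same mean
res_separated <- t.test(no_disease, has_disease)

# Two heavily overlapping classes, both drawn from the same distribution
grp_a <- rnorm(150, mean = 130, sd = 18)
grp_b <- rnorm(150, mean = 130, sd = 18)
res_overlap <- t.test(grp_a, grp_b)

# Compare the p-values of the two scenarios
res_separated$p.value
res_overlap$p.value
```

When the curves are well separated, the p-value should be very small, so the hypothesis of equal means is rejected. When both groups come from the same distribution, the test will usually not find a significant difference.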

```r
plotar(data=heart_disease, str_input=c('resting_blood_pressure', 'max_heart_rate'), str_target="has_heart_disease", plot_type = "histdens")
```

The model will see the same thing: if the curves overlap heavily, as they do for `resting_blood_pressure`, the variable is not as good a predictor as one whose curves are further apart, such as `max_heart_rate`.
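One way to check this intuition is to fit a simple model on each kind of variable. The sketch below uses a logistic regression (`glm`) on simulated data; the variable names `separated` and `overlapped` and their distributions are hypothetical, chosen only to mimic the two situations described above.

```r
set.seed(7)
n <- 300
# Hypothetical binary target and two numeric predictors
target <- rbinom(n, size = 1, prob = 0.5)
separated  <- rnorm(n, mean = ifelse(target == 1, 140, 160), sd = 20)  # class means differ
overlapped <- rnorm(n, mean = 130, sd = 18)                            # same distribution in both classes

# One single-variable logistic regression per predictor
fit_separated  <- glm(target ~ separated,  family = binomial)
fit_overlapped <- glm(target ~ overlapped, family = binomial)

# Drop in deviance relative to the intercept-only model: larger means more signal
gain_separated  <- fit_separated$null.deviance  - fit_separated$deviance
gain_overlapped <- fit_overlapped$null.deviance - fit_overlapped$deviance
c(separated = gain_separated, overlapped = gain_overlapped)
```

The predictor whose class curves are further apart should reduce the deviance much more than the overlapping one, which is the numeric counterpart of what the density plots show visually.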

Keep this in mind when using histograms and box plots. They are most informative when the variable:

• Has a good spread, that is, it is not concentrated on just a handful of distinct values (3, 4, 6...), and
• Has no extreme outliers (this can be treated with the `prep_outliers` function included in this package).