PROFILE

AXCEL.VIZ.PROFILE function

PROFILE is one of the EDA (exploratory data analysis) functions. It is built on the pandas-profiling Python library, which extends the pandas DataFrame for quick data analysis.

For each column the following statistics – if relevant for the column type – are presented in an interactive HTML report:

  • Type inference: detect the types of columns in a dataframe.
  • Essentials: type, unique values, missing values
  • Quantile statistics like minimum value, Q1, median, Q3, maximum, range, interquartile range
  • Descriptive statistics like mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
  • Most frequent values
  • Histogram
  • Correlations: highlighting of highly correlated variables; Spearman, Pearson and Kendall matrices
  • Missing values: matrix, count, heatmap and dendrogram of missing values
  • Text analysis: learn about categories (Uppercase, Space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data
  • File and Image analysis: extract file sizes, creation dates and dimensions, and scan for truncated images or those containing EXIF information
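To make the quantile and descriptive statistics above concrete, here is a minimal sketch that computes a few of them for a single numeric column using only the Python standard library. It is not the code behind PROFILE: the helper name describe_numeric and the sample values are illustrative assumptions, and the report's exact quantile convention may differ from the linear interpolation used here.

```python
import statistics

def describe_numeric(values):
    """Compute a few per-column statistics (hypothetical helper, not PROFILE's code)."""
    present = [v for v in values if v is not None]  # None marks a missing value
    # method="inclusive" interpolates linearly between the observed values
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    mean = statistics.mean(present)
    std = statistics.stdev(present)  # sample standard deviation
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "unique": len(set(present)),
        "min": min(present),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(present),
        "range": max(present) - min(present),
        "iqr": q3 - q1,
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation
    }

# Illustrative values only, not taken from any dataset
column = [2, 4, 4, 6, 8, None, 10, 12]
stats = describe_numeric(column)
print(stats["missing"], stats["unique"], stats["median"], stats["iqr"])
```

The real report adds many more items (histograms, correlations, missing-value plots) that a short sketch like this does not attempt.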

Syntax

AXCEL.VIZ.PROFILE(data_primary, [skip], [set_category], [set_text], [set_numerical], [title], [deployment])


The AXCEL.VIZ.PROFILE function syntax has the following arguments:

data_primary Required. The primary data to be profiled.

skip Optional. A vector of variable names that the function should skip in the analysis.

set_category Optional. A vector of variable names that the function should treat as categorical values.

set_text Optional. A vector of variable names that the function should treat as string or text values.

set_numerical Optional. A vector of variable names that the function should treat as numerical values.

title Optional. The title of the report. By default, the report has no title.

deployment Optional. The deployment, in project/name or owner/project/name format. You first need to create a project by logging into your console (https://console.axcel.io) -> Project -> Create Project. After that, you can use the project name in your deployment. Please note that project and visualization names may contain lowercase letters and numbers only. If a project is shared with you, use the username of the owner in your deployment. Please visit visualization projects and sharing to learn more about this powerful feature.
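The two accepted deployment formats can be illustrated with a small check. This is only a sketch under stated assumptions: parse_deployment is a hypothetical helper, not part of the add-in; the lowercase-letters-and-numbers rule is applied to the project and visualization names as described above; and since the rules for owner usernames are not stated here, the owner part is accepted as any text without a slash.

```python
import re

# Assumption: project and visualization names are lowercase letters and digits only.
# Assumption: the owner username may be any non-empty text without "/".
_DEPLOYMENT = re.compile(r"(?:([^/]+)/)?([a-z0-9]+)/([a-z0-9]+)")

def parse_deployment(deployment):
    """Split 'project/name' or 'owner/project/name' into parts (hypothetical helper)."""
    match = _DEPLOYMENT.fullmatch(deployment)
    if match is None:
        raise ValueError(f"invalid deployment: {deployment!r}")
    owner, project, name = match.groups()
    return {"owner": owner, "project": project, "name": name}

# Placeholder project, visualization and user names
print(parse_deployment("eda/mtcars"))
print(parse_deployment("alice/eda/mtcars"))
```

A string such as "Eda/mtcars" is rejected by this sketch because of the uppercase letter in the project name.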

When you type =AXCEL.VIZ.PROFILE in an Excel cell, IntelliSense guides you through the required and optional (shown in [] brackets) inputs. An example is given below.

Please note that this function runs asynchronously, because its execution time is sometimes longer than 1 minute and some browsers, as well as Excel on Mac, terminate the connection after that time. This means that the function responds immediately while the process continues in the backend. The final result appears in the console in the task pane, or in the plot area if no deployment is chosen.
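The respond-immediately, finish-later pattern described above can be sketched generically in Python. This is not how the AXCEL backend is implemented; the names submit, fetch and build_report are hypothetical, and a thread pool stands in for whatever the real service uses.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)
_tasks = {}  # task id -> Future holding the pending result

def build_report(rows):
    """Stand-in for a slow profiling job (hypothetical)."""
    time.sleep(0.5)
    return f"report for {len(rows)} rows"

def submit(rows):
    """Start the job in the background and return a task id right away."""
    task_id = uuid.uuid4().hex
    _tasks[task_id] = _executor.submit(build_report, rows)
    return task_id

def fetch(task_id, timeout=None):
    """Wait for the background job to finish and return its result."""
    return _tasks[task_id].result(timeout=timeout)

task_id = submit([1, 2, 3])        # returns before the report is ready
print(fetch(task_id, timeout=5))   # the result becomes available later
```

The caller gets control back as soon as the job is queued, which is why a short connection timeout does not interrupt the long-running work.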

Example:

In cell A1, you can run:

=AXCEL.DATASETS("mtcars")

which pulls the mtcars dataset. After that, you can run:

=AXCEL.VIZ.PROFILE(A1#)

This produces an interactive report, which you can expand in your browser.

You can add the deployment argument to make this report available for sharing and collaboration in your projects in the console.

See also visualization projects and sharing