Guide: Building Simple Machine Learning Models with Julius


Julius AI stands out for how capably it implements machine learning models. This guide walks you through building simple machine learning models with Julius, using a bank loan dataset from Kaggle. The dataset helps banks tailor their loan offerings by analyzing client profiles built from demographic and financial indicators.

Data Preparation

Cleaning Column Names

The first step was having Julius clean up the column names. This is not strictly necessary, since Julius's underlying LLMs, such as GPT and Claude, can interpret almost any column nomenclature, but it makes the data easier to manage when you later export and reuse it.
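If you want to do the same cleanup yourself, a minimal pandas sketch might look like the following. The raw column names are assumptions meant to resemble the Kaggle dataset's (e.g. "Personal Loan", "CCAvg"), not the exact output Julius produced:

```python
import pandas as pd

# Hypothetical raw column names, similar in style to the Kaggle bank loan data.
df = pd.DataFrame(columns=["ID", "Personal Loan", "CCAvg", "ZIP Code"])

# Strip whitespace, lowercase, and replace spaces with underscores.
df.columns = (
    df.columns.str.strip()
              .str.lower()
              .str.replace(" ", "_", regex=False)
)
print(list(df.columns))  # ['id', 'personal_loan', 'ccavg', 'zip_code']
```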

Data Description

Upon requesting a data description, Julius impressively described each column based solely on its name. This efficiency underscores the extensive training behind its LLMs and their evident familiarity with financial datasets.

Exploratory Data Analysis

Violin Plots

To analyze the binary outcome variable personal_loan, I generated violin plots. These plots are instrumental in visualizing the distribution of variables relative to the outcome. Despite their inability to conclusively determine significance, they provide insight into potential predictors when combined with other variables.

Note: A violin plot alone does not affirm the predictive value of a variable; however, it aids in visual exploration.
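For illustration, here is a minimal matplotlib sketch of the idea on synthetic data. The variable (income split by the `personal_loan` outcome) and the numbers are stand-ins, not the actual dataset:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted use
import matplotlib.pyplot as plt

# Synthetic stand-in: income distributions by loan outcome.
rng = np.random.default_rng(0)
income_no = rng.normal(60, 20, 500)    # clients who declined the loan
income_yes = rng.normal(120, 30, 50)   # clients who accepted the loan

fig, ax = plt.subplots()
parts = ax.violinplot([income_no, income_yes], showmedians=True)
ax.set_xticks([1, 2])
ax.set_xticklabels(["no loan", "loan"])
ax.set_ylabel("income (k$)")
fig.savefig("violin_income.png")
```

A clear separation between the two violins, as here, hints that the variable is worth keeping as a candidate predictor.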

RandomForest Classifier

Another quick exploratory tool is an ad-hoc RandomForest classifier, used here to gauge variable importances. Variable importance is a metric that ranks each feature's influence on the model's predictive power. In scikit-learn's default implementation it is computed from how much each feature reduces impurity across the forest's splits; a related approach, permutation importance, instead measures how much accuracy drops when a feature's values are shuffled. Either way, it offers insight into the most significant predictors.

Note: This method, lacking thorough validation steps, serves primarily for preliminary data exploration.
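A rough sketch of this exploratory step, using scikit-learn's impurity-based importances on synthetic stand-in data (the feature names are hypothetical, chosen to echo the loan dataset):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in: two informative features plus one pure-noise feature.
rng = np.random.default_rng(0)
n = 1000
income = rng.normal(80, 30, n)
ccavg = rng.normal(2, 1, n)
noise = rng.normal(0, 1, n)
y = (income + 20 * ccavg + rng.normal(0, 20, n) > 140).astype(int)

X = pd.DataFrame({"income": income, "ccavg": ccavg, "noise": noise})
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances (sklearn's default), sorted descending; they sum to 1.
importances = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importances)
```

The noise feature should land at the bottom of the ranking, which is exactly the kind of quick signal this exploratory pass is for.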

Model Building

Comprehensive Modeling with Julius

Julius’s capability to concurrently execute multiple machine learning steps is remarkable. It allows for efficient model comparison and adjustment. In the modeling phase, I focused on enhancing model performance through:

  • Stratified Sampling: Essential for dealing with imbalanced datasets. It ensures that the training and test sets contain a proportionate representation of the outcome classes, improving model relevance and predictive accuracy.
  • StandardScaler: Standardization, especially for models sensitive to variable scales like SVC, adjusts features to have a mean of 0 and a standard deviation of 1. This uniformity prevents models from misinterpreting the data due to scale discrepancies.
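The two steps above can be sketched in scikit-learn roughly as follows (synthetic data; the Pipeline ensures the scaler is fit on training data only, so no information leaks from the test set):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic imbalanced stand-in: roughly 10% positive outcomes.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (rng.random(1000) < 0.1).astype(int)

# stratify=y preserves the ~10% positive rate in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling happens inside fit, so test data is scaled with training statistics.
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```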

Additionally, Julius provided a detailed report combining accuracy, recall, precision, and F1 score, offering a comprehensive overview of model performance.

Just a short primer on these scores:

  • Accuracy:
    • What it measures: The proportion of all predictions (both positive and negative) that the model gets right.
    • Importance: While it gives an overall effectiveness of the model, it may not be the most reliable metric in imbalanced datasets, where the majority class dominates.
    • Relevance: For a balanced dataset regarding loan acceptance, it can provide a quick snapshot of model performance. However, caution is advised if the dataset has far more rejections than acceptances, or vice versa.
  • Precision (Positive Predictive Value):
    • What it measures: The proportion of positive identifications that were actually correct.
    • Importance: Indicates the reliability of positive predictions. High precision means a high rate of true positives among positive calls.
    • Relevance: Critical in scenarios where the cost of false positives is high, such as wrongly predicting a customer will accept a loan offer when they won’t.
  • Recall (Sensitivity):
    • What it measures: The proportion of actual positives that were correctly identified.
    • Importance: Shows the model’s ability to capture all relevant cases. High recall means most true positives were identified.
    • Relevance: Especially important if missing a potential loan acceptor is costly. A model that fails to identify potential acceptors is of limited use.
  • F1 Score:
    • What it measures: The harmonic mean of precision and recall, providing a single metric to assess the balance between them.
    • Importance: Since it considers both precision and recall, it’s a better measure than accuracy on imbalanced datasets.
    • Relevance: Useful for comparing models where both false positives and false negatives carry significant consequences. In the context of loan offers, it helps in selecting a model that balances identifying potential acceptors (recall) without overly increasing false positives (precision).

For predicting personal loan offering acceptance, focusing on the F1 score might be particularly insightful. It ensures a balanced evaluation, considering the dataset’s potential imbalance and the significant costs associated with false positives (offering loans to those who won’t accept) and false negatives (missing out on potential acceptors). Precision and recall are also critical, providing detailed insights into the model’s predictive strengths and weaknesses.


XGBoost demonstrated exceptional performance on this dataset, showcasing Julius's efficiency in handling various machine learning algorithms.

Important Considerations

Understanding the origin and implications of model metrics is crucial. The success of machine learning models heavily depends on feature selection—an aspect often requiring significant industry insight. Julius’s prowess in model facilitation becomes most apparent once relevant features have been identified, enabling rapid model iteration and optimization.


Great guide Antonio, love the attention to detail for beginners like me. I was building a time series model with Julius using Advanced Reasoning mode and bro it was insane. My mentor said the highest he's ever seen on that dataset's test benchmarks was 0.80 but Julius got over 0.86 somehow. Literally magic software


Thank you.

I don’t do a lot of time series modeling myself lately, but that’s really great to hear. I wanted to write a guide soon on how to do cross-validation in Julius, and there would be a time series section as well, which requires a different approach, such as rolling cross-validation.

I think I might do that post next so it complements this one.
