
Xelera Decision Tree Inference

Xelera Decision Tree Inference provides FPGA-accelerated inference (prediction) for real-time classification and regression applications where high throughput or low latency matters. It supports the Random Forest, XGBoost, and LightGBM algorithms. Users first train their own model with one of the supported frameworks (scikit-learn, XGBoost, LightGBM, H2O.ai, and H2O Driverless AI), then load the model and run predictions through a Python call to the Xelera Decision Tree Inference library.

Vendor: Xelera

Last Update: May 03, 2021

Size: 1.25 GB

Container Version: 0.6.0b6drm

Try or Buy

Obtain an entitlement to evaluate or purchase this product.


Begin a free trial and run the application example below.

View and purchase available pricing plans for this application.


Deployment Options

This application is containerized and can be easily run in a few minutes in the cloud, or on-premises.

On Premises
Alveo U50
  • Xilinx Runtime: 2020.2
  • Target Platform: xilinx_u50_gen3x16_xdma_201920_3
Alveo U200
  • Xilinx Runtime: 2020.2
  • Target Platform: xilinx_u200_xdma_201830_2

Start Evaluation

Follow the instructions based on your deployment method.

On Premises Alveo U200 and Alveo U50

1.

Obtain an Account Access Key

An access key is required to authenticate a user and grant them access to the application based on their entitlements.  To obtain your account access key, follow these steps:

  • Log in to the Xilinx App Store Portal.
  • Click the button labeled "Manage Account" to view entitlements.
  • Click the "Access Key" link in the left side menu.
  • Click the "Create an Access Key" button.
  • Download the resulting file "cred.json" to the directory from which you will run the Docker commands in step 3. The docker run command there mounts ${PWD}/cred.json into the container.

Note:  The resulting access key will enable all entitlements within your account.  If you have not yet obtained entitlements from the "TRY OR BUY" section above, you must do so before following these steps for generating your access key.
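
Before moving on, it can be useful to confirm that the downloaded key file is where the later steps expect it. The following is a minimal sketch, assuming only that cred.json is a JSON document (as its name suggests); the function name check_access_key is hypothetical, and no particular fields are assumed because the file's contents are not documented here.

```python
import json
from pathlib import Path


def check_access_key(path):
    """Return the parsed access-key file, or raise ValueError if unusable.

    Assumption: cred.json is plain JSON. No specific keys are checked,
    because the file layout is not described in this guide.
    """
    key_file = Path(path)
    if not key_file.is_file():
        raise ValueError(f"access key not found: {key_file}")
    try:
        data = json.loads(key_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise ValueError(f"{key_file} is not valid JSON: {exc}") from exc
    if not data:
        raise ValueError(f"{key_file} contains no data")
    return data
```

Calling check_access_key("cred.json") from the directory used for the Docker commands raises an error early if the file is missing or truncated, instead of failing later inside the container.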


2.

Host Setup

The Xilinx Runtime (XRT) host application is supported on Ubuntu 16.04/18.04 and CentOS 7.x. With sudo access, use the following commands to download and run the setup script:


2.1 Clone GitHub Repository for Xilinx Base Runtime

    git clone https://github.com/Xilinx/Xilinx_Base_Runtime.git 

2.2 Run the Host Setup Script

    cd Xilinx_Base_Runtime
    ./host_setup.sh -v 2020.2


Note: 

  • Please wait for the installation to complete. During this time you may need to press [Y] to continue the host setup.
  • If you choose to flash the FPGA, you will need to cold reboot the local machine after the installation has completed so that the new image is loaded onto the FPGA.
  • The host setup script can also be used to set up other versions of XRT and the shell. See https://github.com/Xilinx/Xilinx_Base_Runtime for more details.
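
The supported-OS requirement stated at the start of this step can be checked programmatically. The sketch below is an illustration only: it assumes the host exposes the usual /etc/os-release file with ID and VERSION_ID entries, and the helper names parse_os_release and is_supported_host are hypothetical, not part of the Xilinx setup scripts.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


def is_supported_host(info):
    """True for Ubuntu 16.04/18.04 and CentOS 7.x, as listed in this guide."""
    os_id = info.get("ID", "").lower()
    version = info.get("VERSION_ID", "")
    if os_id == "ubuntu":
        return version in ("16.04", "18.04")
    if os_id == "centos":
        return version == "7" or version.startswith("7.")
    return False
```

On a real host, the input would come from reading /etc/os-release, for example is_supported_host(parse_os_release(open("/etc/os-release").read())).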

Install Docker (if not already installed)

With sudo access, use the following commands to run the utility script that installs Docker.


2.3 Go to Xilinx_Base_Runtime utilities directory

    cd Xilinx_Base_Runtime/utilities

2.4 Run the Docker installation script

    ./docker_install.sh 

3.

Application Execution

Enter the following commands in a terminal window to run the application:


3.1 Set Up Environment Variables Using the Script from Xilinx_Base_Runtime

    source Xilinx_Base_Runtime/utilities/xilinx_docker_setup.sh

3.2 Pull the Docker Image

Alveo U200

    docker pull xilinxpartners/xelera_decision_tree_inference:alveo_u200_2020.2_0.6.0b6drm


Alveo U50

    docker pull xilinxpartners/xelera_decision_tree_inference:alveo_u50_2020.2_0.6.0b6drm

3.3 Run the Docker Image

Alveo U200

    docker run -it --rm $XILINX_DOCKER_DEVICES --mount type=bind,source=${PWD}/cred.json,target=/opt/xelera/cred.json,readonly --name cont-decision-tree-inference xilinxpartners/xelera_decision_tree_inference:alveo_u200_2020.2_0.6.0b6drm /bin/bash


Alveo U50

    docker run -it --rm $XILINX_DOCKER_DEVICES --mount type=bind,source=${PWD}/cred.json,target=/opt/xelera/cred.json,readonly --name cont-decision-tree-inference xilinxpartners/xelera_decision_tree_inference:alveo_u50_2020.2_0.6.0b6drm /bin/bash


Description of command arguments:

  • $XILINX_DOCKER_DEVICES - Variables set by the host setup script
  • --mount type=bind,source=${PWD}/cred.json,target=/opt/xelera/cred.json,readonly - Map the downloaded cred.json to the container.
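
The two docker run commands differ only in the image tag, so the argument list can be assembled from its parts. The following is a sketch under stated assumptions: the function build_docker_run is hypothetical, the docker_devices argument stands in for whatever $XILINX_DOCKER_DEVICES expands to on your host (its exact contents are not documented here), and the image name and tags are copied from the commands in this step.

```python
import shlex

IMAGE = "xilinxpartners/xelera_decision_tree_inference"
# Tags as they appear in the pull and run commands of this guide.
TAGS = {
    "u200": "alveo_u200_2020.2_0.6.0b6drm",
    "u50": "alveo_u50_2020.2_0.6.0b6drm",
}


def build_docker_run(board, cred_path, docker_devices):
    """Return the docker run argument list for the given Alveo board.

    docker_devices is the string held in $XILINX_DOCKER_DEVICES; it is
    split shell-style so each device flag becomes its own argument.
    """
    if board not in TAGS:
        raise ValueError(f"unknown board: {board!r}")
    mount = (
        f"type=bind,source={cred_path},"
        "target=/opt/xelera/cred.json,readonly"
    )
    return [
        "docker", "run", "-it", "--rm",
        *shlex.split(docker_devices),
        "--mount", mount,
        "--name", "cont-decision-tree-inference",
        f"{IMAGE}:{TAGS[board]}",
        "/bin/bash",
    ]
```

The returned list can be passed to subprocess.run, with os.environ["XILINX_DOCKER_DEVICES"] as the third argument once the environment script from step 3.1 has been sourced in the same shell.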

3.4 Run the Predict Flight Delay Example Application


Follow the instructions below and refer to the product documentation for more details.

  • Get a copy of the example scripts from the Xelera Decision Tree Inference GitHub repository, together with the flight delay dataset:
    /app/setup_example_flight_delay.sh
  • Navigate to the folder:
    cd /app/xelera_demo/Tree-Inference/
  • Run the Random Forest multinomial classification (4 classes) with 100 trees and 1000 samples:
    python3 scripts/RF_scikit_flight.py --data_fpath /app/xelera_demo/data/flight-delays/flights.csv --enable_multinomial true --enable_binomial false --enable_regression false --number_of_trees 100 --num_test_samples 1000 --n_loops 1000
  • The script reports accuracy, latency, and throughput for the CPU (SW) and FPGA (HW) inference runs. The software run can take some time because the test is repeated 1000 times to obtain accurate timing measurements.
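
To illustrate why repeating the test many times gives steadier timing figures, here is a generic timing harness. It is a sketch, not the code used by RF_scikit_flight.py: the benchmark function and its arguments are hypothetical, and the example script may compute its reported throughput differently from the simple samples-per-latency ratio used here.

```python
import time


def benchmark(predict, samples, n_loops=1000):
    """Return (average latency in seconds, throughput in samples/s).

    predict is any callable that accepts the batch of samples; the batch
    is submitted n_loops times and the total time is averaged.
    """
    if n_loops < 1:
        raise ValueError("n_loops must be at least 1")
    start = time.perf_counter()
    for _ in range(n_loops):
        predict(samples)
    elapsed = time.perf_counter() - start
    latency = elapsed / n_loops
    # Guard against a zero reading on very coarse timers.
    throughput = len(samples) / latency if latency > 0 else float("inf")
    return latency, throughput
```

Averaging over many loops smooths out one-off costs such as cache warm-up and scheduler jitter, which is why a single run can be misleading for sub-millisecond inference calls.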

4.

Results

Alveo U200

    root@5a4b0b93569c:/app/xelera_demo/Tree-Inference# python3 scripts/RF_scikit_flight.py --data_fpath /app/xelera_demo/data/flight-delays/flights.csv --enable_multinomial true --enable_binomial false --enable_regression false --number_of_trees 100 --num_test_samples 1000
    Loading dataset...
    sys:1: DtypeWarning: Columns (7,8) have mixed types.Specify dtype option on import or set low_memory=False.
    ##############################################
    RF Multinomial with Numerical Features
    SK multinomial: Start training ...
    Model is not available, start training ...
    Training_time:       3.1001360011287034 s
    max nodes =  503 
    model conversion to .xlmodel time: 0.277881772024557 s
    Starting SW inference ...
    SW mse 5.862252126955988
    SW error 4.034
    SW accuracy score 0.486
    SW predict latency (average on 1000 runs for 1000 samples):  1.20e-01 s
    SW predict throughput: 8.35e+03 samples/s
    SW Number of features: 10
    SW Number of trees: 100
    SW Number of classes: 10
    [07:24:08:758][   INFO] <XlRfInference> Using device Xilinx - xilinx_u200_xdma_201830_2 [FPGA] (943bff10-53af-4378-cbc4-00efac507c87)
    [07:24:09:521][   INFO] <XlRfInference> [DRMLIB] Start Session ..
    [07:24:11:740][   INFO] <XlRfInference> [DRMLIB] Done.
    Starting HW inference ...
    HW mse 5.595176494088458
    HW error 3.694
    HW accuracy score 0.525
    HW predict latency (average on 1000 runs for 1000 samples):  1.70e-03 s
    HW predict throughput: 3.10e+06 samples/s
    HW Number of features: 10
    HW Number of trees: 100
    HW Number of classes: 10

Alveo U50

    root@8b417af2b354:/app/xelera_demo/Tree-Inference# python3 scripts/RF_scikit_flight.py --data_fpath /app/xelera_demo/data/flight-delays/flights.csv --enable_multinomial true --enable_binomial false --enable_regression false --number_of_trees 100 --num_test_samples 1000
    Loading dataset...
    sys:1: DtypeWarning: Columns (7,8) have mixed types.Specify dtype option on import or set low_memory=False.
    ##############################################
    RF Multinomial with Numerical Features
    SK multinomial: Start training ...
    Model is not available, start training ...
    Training_time:       2.8406101659638807 s
    max nodes =  503 
    model conversion to .xlmodel time: 0.26975053607020527 s
    Starting SW inference ...
    SW mse 5.862252126955988
    SW error 4.034
    SW accuracy score 0.486
    SW predict latency (average on 1000 runs for 1000 samples):  1.19e-01 s
    SW predict throughput: 8.43e+03 samples/s
    SW Number of features: 10
    SW Number of trees: 100
    SW Number of classes: 10
    [07:45:41:628][   INFO] <XlRfInference> Using device Xilinx - xilinx_u50_gen3x16_xdma_201920_3 [FPGA] (16208aae-e489-4dcf-c9df-75511b767230)
    [07:45:42:288][   INFO] <XlRfInference> [DRMLIB] Start Session ..
    [07:45:44:287][   INFO] <XlRfInference> [DRMLIB] Done.
    Starting HW inference ...
    HW mse 5.595176494088458
    HW error 3.694
    HW accuracy score 0.525
    HW predict latency (average on 1000 runs for 1000 samples):  1.87e-03 s
    HW predict throughput: 3.04e+06 samples/s
    HW Number of features: 10
    HW Number of trees: 100
    HW Number of classes: 10
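
To compare the software and hardware figures in logs like the ones above, the latency and throughput lines can be extracted and turned into ratios. This is a minimal sketch that assumes the log lines keep the "SW predict ..." and "HW predict ..." wording shown in these results; the helper names parse_metrics and speedups are hypothetical.

```python
import re

# Matches lines such as:
#   SW predict latency (average on 1000 runs for 1000 samples):  1.20e-01 s
#   HW predict throughput: 3.10e+06 samples/s
_METRIC = re.compile(
    r"^[ \t]*(SW|HW) predict (latency|throughput)[^:]*:\s*([0-9.eE+-]+)",
    re.MULTILINE,
)


def parse_metrics(log_text):
    """Map (side, metric) pairs, e.g. ("HW", "latency"), to float values."""
    return {
        (side, name): float(value)
        for side, name, value in _METRIC.findall(log_text)
    }


def speedups(metrics):
    """Return HW-over-SW improvement factors for latency and throughput."""
    return {
        "latency": metrics[("SW", "latency")] / metrics[("HW", "latency")],
        "throughput": metrics[("HW", "throughput")] / metrics[("SW", "throughput")],
    }
```

Applied to the Alveo U200 log, the ratios come to roughly 71x for latency and 371x for throughput; for the Alveo U50 log they are roughly 64x and 361x.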