# Compass

**Repository Path**: trent/compass

## Basic Information

- **Project Name**: Compass
- **Description**: Compass is a big data task diagnosis platform that aims to improve troubleshooting efficiency and reduce the cost of abnormal tasks. Its main features: non-invasive, instant diagnosis that works without modifying the existing scheduling platform; support for multiple mainstream scheduling platforms, such as DolphinScheduler, Airflow, or self-developed ones; diagnosis and parsing of task logs from multiple versions of Spark and from Hadoop 2.x and 3.x; workflow-layer exception diagnosis that identifies various failures and baseline time-consumption anomalies.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 19
- **Created**: 2023-04-06
- **Last Updated**: 2023-04-06

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Compass

[Chinese documentation](README_zh.md)

Compass is a big data task diagnosis platform that aims to make troubleshooting more efficient and to reduce the cost of abnormal tasks.

Key features:

- Non-invasive, instant diagnosis: you can try the diagnostics without modifying your existing scheduling platform.
- Supports multiple scheduling platforms (DolphinScheduler, Airflow, self-developed schedulers, etc.).
- Supports troubleshooting for Spark 2.x and 3.x and Hadoop 2.x and 3.x.
- Supports workflow-layer exception diagnosis, identifying various failures and baseline time-consumption anomalies.
- Supports Spark engine-layer exception diagnosis, covering 14 types of exceptions such as data skew, large table scanning, and memory waste.
- Supports custom log-matching rules and adjustable anomaly thresholds, so diagnosis can be tuned to actual scenarios.

Compass supports the following diagnostic types:

| Diagnostic Dimensions | Diagnostic Type | Type Description |
| --- | --- | --- |
| Failure analysis | Run failure | Tasks that ultimately fail to run |
| Failure analysis | First failure | Tasks that have been retried more than once |
| Failure analysis | Long term failure | Tasks that have failed to run in the last ten days |
| Time analysis | Baseline time abnormality | Tasks that end earlier or later than the historical normal end time |
| Time analysis | Baseline time-consuming abnormality | Tasks that run for too long or too short relative to the historical normal running time |
| Time analysis | Long running time | Tasks that run for more than two hours |
| Error analysis | SQL failure | Tasks that fail due to SQL execution issues |
| Error analysis | Shuffle failure | Tasks that fail due to shuffle execution issues |
| Error analysis | Memory overflow | Tasks that fail due to memory overflow issues |
| Cost analysis | Memory waste | Tasks with a peak memory usage to total memory ratio that is too low |
| Cost analysis | CPU waste | Tasks with a driver/executor calculation time to total CPU calculation time ratio that is too low |
| Efficiency analysis | Large table scanning | Tasks with too many scanned rows due to no partition restrictions |
| Efficiency analysis | OOM warning | Tasks with a cumulative memory of broadcast tables and a high memory ratio of driver or executor |
| Efficiency analysis | Data skew | Tasks where the maximum amount of data processed by the task in the stage is much larger than the median |
| Efficiency analysis | Job time-consuming abnormality | Tasks with a high ratio of idle time to job running time |
| Efficiency analysis | Stage time-consuming abnormality | Tasks with a high ratio of idle time to stage running time |
| Efficiency analysis | Task long tail | Tasks where the maximum running time of the task in the stage is much larger than the median |
| Efficiency analysis | HDFS stuck | Tasks where the processing rate of tasks in the stage is too slow |
| Efficiency analysis | Too many speculative execution tasks | Tasks in which speculative execution of tasks frequently occurs in the stage |
| Efficiency analysis | Global sorting abnormality | Tasks with long running time due to global sorting |
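
Several of the diagnostic types above reduce to comparing a ratio or a duration against a threshold. The following is a minimal sketch of that idea in Python, not Compass's actual implementation: the function names and the `SKEW_RATIO_THRESHOLD` and `MEMORY_WASTE_THRESHOLD` values are assumptions chosen for illustration, and only the two-hour limit comes from the table.

```python
from statistics import median

# Hypothetical thresholds, for illustration only; this README does not state Compass's defaults.
SKEW_RATIO_THRESHOLD = 10.0    # max/median ratio above which a stage is flagged
MEMORY_WASTE_THRESHOLD = 0.2   # peak/total memory ratio below which memory is flagged as wasted
LONG_RUNNING_SECONDS = 2 * 60 * 60  # "more than two hours", as listed in the table


def is_data_skew(task_values, threshold=SKEW_RATIO_THRESHOLD):
    """Flag a stage whose largest per-task value is much larger than the median.

    The same comparison applies to "Task long tail" when the values are
    per-task running times instead of per-task data volumes.
    """
    if not task_values:
        return False
    mid = median(task_values)
    if mid <= 0:
        return False  # no meaningful ratio can be formed
    return max(task_values) / mid > threshold


def is_memory_waste(peak_bytes, total_bytes, threshold=MEMORY_WASTE_THRESHOLD):
    """Flag a task whose peak memory usage is a small fraction of total memory."""
    if total_bytes <= 0:
        return False
    return peak_bytes / total_bytes < threshold


def is_long_running(duration_seconds):
    """Flag a task that runs for more than two hours."""
    return duration_seconds > LONG_RUNNING_SECONDS


if __name__ == "__main__":
    # One task processes far more data than the others in the stage.
    print(is_data_skew([100, 110, 95, 5000]))
    # Peak usage is 1 GiB out of 10 GiB allocated.
    print(is_memory_waste(1 * 1024**3, 10 * 1024**3))
    # A task that ran for three hours.
    print(is_long_running(3 * 60 * 60))
```

Using the median rather than the mean as the baseline keeps a single oversized task from inflating the reference value it is compared against.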

## Get Started

### 1. Compile

```shell
git clone https://github.com/cubefs/compass.git
cd compass
mvn package -DskipTests
```

### 2. Configure

```shell
cd dist/compass
vim bin/compass_env.sh

# Scheduler MySQL
export SCHEDULER_MYSQL_ADDRESS="ip:port"
export SCHEDULER_MYSQL_DB="scheduler"
export SCHEDULER_DATASOURCE_USERNAME="user"
export SCHEDULER_DATASOURCE_PASSWORD="pwd"

# Compass MySQL
export COMPASS_MYSQL_ADDRESS="ip:port"
export COMPASS_MYSQL_DB="compass"
export SPRING_DATASOURCE_USERNAME="user"
export SPRING_DATASOURCE_PASSWORD="pwd"

# Kafka
export SPRING_KAFKA_BOOTSTRAPSERVERS="ip1:port,ip2:port"

# Redis
export SPRING_REDIS_CLUSTER_NODES="ip1:port,ip2:port"

# Zookeeper
export SPRING_ZOOKEEPER_NODES="ip1:port,ip2:port"

# Elasticsearch
export SPRING_ELASTICSEARCH_NODES="ip1:port,ip2:port"
```

### 3. Deploy

```shell
./bin/start_all.sh
```

## Documents

[architecture document](document/manual/architecture.md)

[deployment document](document/manual/deployment.md)

## User Interface

![overview](document/manual/img/overview.png)

![overview-1](document/manual/img/overview-1.png)

![tasks](document/manual/img/tasks.png)

![onclick](document/manual/img/onclick.png)

![application](document/manual/img/application.png)

![cpu](document/manual/img/cpu.png)

![memory](document/manual/img/memory.png)

## License

Compass is licensed under the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0). For details, see [LICENSE](LICENSE) and [NOTICE](NOTICE).