
Doris BrokerLoad Error: quality not good enough to cancel

Broker Load statement

LOAD
LABEL gaofeng_broker_load_HDD
(
    DATA INFILE("hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1/*")
    INTO TABLE ads_user
)
    WITH BROKER "hdfs_broker"
(
    "dfs.nameservices"="eadhadoop",
    "dfs.ha.namenodes.eadhadoop" = "nn1,nn2",
    "dfs.namenode.rpc-address.eadhadoop.nn1" = "h4:8000",
    "dfs.namenode.rpc-address.eadhadoop.nn2" = "z7:8000",
    "dfs.client.failover.proxy.provider.eadhadoop" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos","kerberos_principal" = "ou3.CN",
    "kerberos_keytab_content" = "BQ8uMTYzLkNPTQALY291cnNlXgAAAAFfVyLbAQABAAgCtp0qmxxP8QAAAAE="
);

Error reported

Task cancelled

type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel

Solution:

Generally there is a deeper reason behind this error.
You can find the URL field of the Broker Load job through show load, then run

show load warnings on '{URL}'

or open the URL directly in a browser.

There you will see messages such as an inconsistent number of fields. The root cause is that the number of fields in some rows of the source file does not match the number of columns in the table, or that the value of a field in some rows exceeds the length limit of the corresponding table column. Both are data quality problems, and the source data needs to be adjusted accordingly.
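For example, with the label used above (the '{URL}' placeholder stands for whatever the URL column of show load returns for this job):

show load where label = "gaofeng_broker_load_HDD";  -- copy the URL column from the output
show load warnings on '{URL}';                      -- lists the first rows rejected for quality problems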

If you want to ignore these error rows,
add the property "max_filter_ratio" = "1" to the load statement:

LOAD
LABEL gaofeng_broker_load_HDD
(
    DATA INFILE("hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1/*")
    INTO TABLE ads_user
)
    WITH BROKER "hdfs_broker"
(
    "dfs.nameservices"="eadhadoop",
    "dfs.ha.namenodes.eadhadoop" = "nn1,nn2",
    "dfs.namenode.rpc-address.eadhadoop.nn1" = "h4:8000",
    "dfs.namenode.rpc-address.eadhadoop.nn2" = "z7:8000",
    "dfs.client.failover.proxy.provider.eadhadoop" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos","kerberos_principal" = "ou3.CN",
    "kerberos_keytab_content" = "BQ8uMTYzLkNPTQALY291cnNlXgAAAAFfVyLbAQABAAgCtp0qmxxP8QAAAAE="
)
PROPERTIES
(
    "max_filter_ratio" = "1"
);
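Note that max_filter_ratio is the maximum fraction of rows that may be filtered out before the job is cancelled, so "1" tolerates any amount of bad data; a smaller threshold is usually safer in production. After the job finishes you can check how many rows were actually dropped: in the show load output, the EtlInfo column reports dpp.abnorm.ALL (filtered rows) and dpp.norm.ALL (loaded rows).

show load where label = "gaofeng_broker_load_HDD";  -- inspect EtlInfo for filtered-row counts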

Doris BrokerLoad Error: No source file in this table [How to Solve]

Broker Load statement

LOAD
LABEL gaofeng_broker_load_HDD
(
    DATA INFILE("hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1")
    INTO TABLE ads_user
)
    WITH BROKER "hdfs_broker"
(
    "dfs.nameservices"="eadhadoop",
    "dfs.ha.namenodes.eadhadoop" = "nn1,nn2",
    "dfs.namenode.rpc-address.eadhadoop.nn1" = "h4:8000",
    "dfs.namenode.rpc-address.eadhadoop.nn2" = "z7:8000",
    "dfs.client.failover.proxy.provider.eadhadoop" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos","kerberos_principal" = "ou3.CN",
    "kerberos_keytab_content" = "BQ8uMTYzLkNPTQALY291cnNlXgAAAAFfVyLbAQABAAgCtp0qmxxP8QAAAAE="
);

Error reported

Task cancelled

type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = No source file in this table(ads_user).

Solution:

The data file path in the Broker Load statement is written incorrectly: it must point to files, not to a directory. The path below is the directory the table was exported into; the directory itself cannot be used in Broker Load, only the many files under it:

hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1

Modify it to

hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1/*

and the load succeeds.
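For reference, DATA INFILE also accepts a comma-separated list of concrete file paths instead of a wildcard; a minimal sketch (the part-0000x file names are illustrative, not from the original job):

DATA INFILE(
    "hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1/part-00000",
    "hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1/part-00001"
)
INTO TABLE ads_user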

[Solved] Doris BrokerLoad Error: Scan bytes per broker scanner exceed limit: 3221225472

Broker Load statement

LOAD
LABEL gaofeng_broker_load_HDD
(
    DATA INFILE("hdfs://eoop/user/coue_data/hive_db/couta_test/ader_lal_offline_0813_1/*")
    INTO TABLE ads_user
)
    WITH BROKER "hdfs_broker"
(
    "dfs.nameservices"="eadhadoop",
    "dfs.ha.namenodes.eadhadoop" = "nn1,nn2",
    "dfs.namenode.rpc-address.eadhadoop.nn1" = "h4:8000",
    "dfs.namenode.rpc-address.eadhadoop.nn2" = "z7:8000",
    "dfs.client.failover.proxy.provider.eadhadoop" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "hadoop.security.authentication" = "kerberos","kerberos_principal" = "ou3.CN",
    "kerberos_keytab_content" = "BQ8uMTYzLkNPTQALY291cnNlXgAAAAFfVyLbAQABAAgCtp0qmxxP8QAAAAE="
);

Error reported

Task cancelled

type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = Scan bytes per broker scanner exceed limit: 3221225472

Solution:

The Doris test environment has three BE nodes, the FE configuration item max_bytes_per_broker_scanner defaults to 3 GB, and the files to be imported total about 13 GB, so the parameter has to be raised.
Execute the following dynamic parameter modification command on the FE:

admin set frontend config ("max_bytes_per_broker_scanner" = "5368709120");

This raises the per-scanner limit to 5 GB, so the maximum data size the cluster can import in one job becomes 5 GB * 3 (BE) = 15 GB.
Then execute the load again.
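You can verify the change with admin show frontend config (note that admin set frontend config only affects the running FE; to survive a restart the value must also be set in fe.conf):

admin show frontend config like "max_bytes_per_broker_scanner";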
