Tag Archives: Scala

[Solved] Spark Error: org.apache.spark.SparkException: A master URL must be set in your configuration

Error when running the project to connect to Spark:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/10/08 21:02:10 INFO SparkContext: Running Spark version 3.0.0
22/10/08 21:02:10 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:380)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:120)
	at test.wyh.wordcount.TestWordCount$.main(TestWordCount.scala:10)
	at test.wyh.wordcount.TestWordCount.main(TestWordCount.scala)
22/10/08 21:02:10 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:380)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:120)
	at test.wyh.wordcount.TestWordCount$.main(TestWordCount.scala:10)
	at test.wyh.wordcount.TestWordCount.main(TestWordCount.scala)

Process finished with exit code 1

Solution:

Add the following VM option to the run configuration in IDEA:

-Dspark.master=local[*]

Restart IDEA.
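As an alternative to the VM option (a sketch, not from the original post), the master URL can be set directly in code when building the SparkConf; local[*] runs Spark locally with one worker thread per core:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TestWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("TestWordCount")
      // Setting the master here makes the -Dspark.master VM option unnecessary;
      // replace local[*] with a real master URL (e.g. spark://host:7077) on a cluster.
      .setMaster("local[*]")
    val sc = new SparkContext(conf)
    // ... word-count logic ...
    sc.stop()
  }
}
```

Hard-coding a master is convenient for local debugging; for cluster submission it is usually left unset and supplied via spark-submit --master instead.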

[Solved] kafka startup Error: ERROR Fatal error during KafkaServer startup. Prepare to shutdown

1. Error Message:

ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.InconsistentBrokerIdException: Configured broker.id 0 doesn’t match stored broker.id Some(1) in meta.properties. If you moved your data, make sure your configured broker.id matches. If you intend to create a new broker, you should remove all data in your data directories (log.dirs).
at kafka.server.KafkaServer.getOrGenerateBrokerId(KafkaServer.scala:793)
at kafka.server.KafkaServer.startup(KafkaServer.scala:221)
at kafka.Kafka$.main(Kafka.scala:109)
at kafka.Kafka.main(Kafka.scala)


2. Cause

The broker.id stored in meta.properties (path: /opt/kafka/logs) does not match the broker.id configured in server.properties under /opt/kafka/config.

How it happened: a file was accidentally deleted on the Linux machine, to the point that even basic commands such as cd no longer worked. Fortunately other nodes were still usable, and after some recovery work ZooKeeper ran successfully. Starting Kafka, however, failed with ERROR Fatal error during KafkaServer startup. Prepare to shutdown, the first line of the error message above.

(The value shown is after the fix; originally it was broker.id=1.)


3. System prompt

    [2022-06-18 14:32:02,309] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
    kafka.common.InconsistentBrokerIdException: Configured broker.id 0 doesn't match stored broker.id Some(1) in meta.properties. If you moved your data, make sure your configured broker.id matches. If you intend to create a new broker, you should remove all data in your data directories (log.dirs).
        at kafka.server.KafkaServer.getOrGenerateBrokerId(KafkaServer.scala:793)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:221)
        at kafka.Kafka$.main(Kafka.scala:109)
        at kafka.Kafka.main(Kafka.scala)
    [2022-06-18 14:32:02,323] INFO shutting down (kafka.server.KafkaServer)
    [2022-06-18 14:32:02,354] INFO [feature-zk-node-event-process-thread]: Shutting down (kafka.server.FinalizedFeatureChangeListener$ChangeNotificationProcessorThread)
    [2022-06-18 14:32:02,360] INFO [feature-zk-node-event-process-thread]: Stopped (kafka.server.FinalizedFeatureChangeListener$ChangeNotificationProcessorThread)
    [2022-06-18 14:32:02,383] INFO [feature-zk-node-event-process-thread]: Shutdown completed (kafka.server.FinalizedFeatureChangeListener$ChangeNotificationProcessorThread)
    [2022-06-18 14:32:02,406] INFO [ZooKeeperClient Kafka server] Closing. (kafka.zookeeper.ZooKeeperClient)
    [2022-06-18 14:32:02,568] INFO Session: 0x1000059c2cd0000 closed (org.apache.zookeeper.ZooKeeper)
    [2022-06-18 14:32:02,583] INFO EventThread shut down for session: 0x1000059c2cd0000 (org.apache.zookeeper.ClientCnxn)
    [2022-06-18 14:32:02,588] INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient)
    [2022-06-18 14:32:02,621] INFO App info kafka.server for 0 unregistered (org.apache.kafka.common.utils.AppInfoParser)
    [2022-06-18 14:32:02,624] INFO shut down completed (kafka.server.KafkaServer)
    [2022-06-18 14:32:02,625] ERROR Exiting Kafka. (kafka.Kafka$)
    [2022-06-18 14:32:02,655] INFO shutting down (kafka.server.KafkaServer)

4. Solution
Having found the cause, change broker.id=0 and broker.id=1 to the same value in both files, then restart Kafka. The ERROR Fatal error during KafkaServer startup no longer appears.
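Concretely, the fix is to make the two files agree. Assuming the paths from this post (the id value 1 is just an example), the relevant lines would be:

```
# In /opt/kafka/config/server.properties:
broker.id=1

# In /opt/kafka/logs/meta.properties:
broker.id=1
```

A commonly used alternative is to delete meta.properties and let Kafka regenerate it from the configured broker.id on the next startup.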

[Solved] Kafka Error: kafka.common.InconsistentClusterIdException…

1. Background

The physical machine hosting Kafka went down unexpectedly, and Kafka subsequently failed to start.

2. Details of error report

[2022-08-09 08:20:42,097] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.InconsistentClusterIdException: The Cluster ID 123456 doesn't match stored clusterId Some(456789) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
	at kafka.server.KafkaServer.startup(KafkaServer.scala:235)
	at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
	at kafka.Kafka$.main(Kafka.scala:82)
	at kafka.Kafka.main(Kafka.scala)

3. Solution

The error is clear: the cluster ID does not match the one stored in meta.properties. Fix it by editing the meta.properties configuration:

# The location of the meta.properties file can be found based on the value of the log.dirs parameter in the server.properties configuration file
vim meta.properties

cluster.id=123456

Kafka then starts normally.
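For reference (using the placeholder IDs from the error message above), the edit makes the stored ID match the cluster the broker is joining; meta.properties lives under the directory named by log.dirs in server.properties:

```
# meta.properties, under the log.dirs directory
cluster.id=123456
```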

SQL Server Error: Arithmetic overflow error converting expression to data type int.

1. Problem description

Using count to get the number of rows in a table on SQL Server (SQL DW) fails:

select count(*)  from test.test_t;

Then an error is reported:

SQL ERROR [8115] [S0002]: Arithmetic overflow error converting expression to data type int.

2. Cause of the problem

The table is very large: count returns an int, and the row count exceeds the int range.

tinyint: integers from 0 to 255
smallint: integers from -2^15 (-32,768) to 2^15-1 (32,767)
int: integers from -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647)
bigint: integers from -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
decimal: fixed-precision numeric data from -10^38+1 to 10^38-1


3. Solution

SQL Server provides the COUNT_BIG function, which returns bigint:

select count_big(*)  from test.test_t;
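The boundary is easy to see in Scala, whose Int is the same 32-bit signed type as SQL Server's int (a quick sketch, not part of the original post):

```scala
// SQL Server's int and Scala's Int share the same 32-bit signed range.
val maxInt: Int = Int.MaxValue           // 2147483647, the largest value COUNT can return
val oneMore: Long = maxInt.toLong + 1    // 2147483648 needs a 64-bit type (bigint / Long)
println(s"$maxInt + 1 wraps to ${maxInt + 1} in 32 bits; widened to Long: $oneMore")
```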

[Solved] ERROR SparkContext: Error initializing SparkContext. java.lang.IllegalArgumentException: System memo

During actual operation of the recommendation system, executing the featureengineering (Scala) file fails with the following error:
ERROR SparkContext: Error initializing SparkContext. java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200

Solution:
1. Click Edit configurations

2. If your run-configuration dialog already shows a VM options field, paste the following text into it directly:

-Xms128m -Xmx512m -XX:MaxPermSize=300m -ea


3. If the field is not shown, click Modify options first and enable Add VM options, then complete step 2.

This should solve the problem.
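If editing the run configuration is not convenient, the same error can also be worked around in code for local runs by raising spark.testing.memory above the 471859200-byte minimum the message mentions (a sketch under the assumption of a simple local job; the app name is illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: tell Spark the JVM has more memory than it detected.
// 512000000 bytes clears the 471859200-byte minimum from the error message.
val conf = new SparkConf()
  .setAppName("FeatureEngineering")
  .setMaster("local[*]")
  .set("spark.testing.memory", "512000000")
val sc = new SparkContext(conf)
```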

IDEA Create Scala Project Error: scalac: Error: Error compiling the sbt component ‘compiler-interface-2.10.0-52.0‘

Error Messages:

scalac: Error: Error compiling the sbt component ‘compiler-interface-2.10.0-52.0’
sbt.internal.inc.CompileFailed: Error compiling the sbt component ‘compiler-interface-2.10.0-52.0’
at sbt.internal.inc.AnalyzingCompiler$.handleCompilationError$1(AnalyzingCompiler.scala:424)
at sbt.internal.inc.AnalyzingCompiler$.$anonfun$compileSources$5(AnalyzingCompiler.scala:441)
at sbt.internal.inc.AnalyzingCompiler$.$anonfun$compileSources$5$adapted(AnalyzingCompiler.scala:436)
at sbt.io.IO$.withTemporaryDirectory(IO.scala:490)
at sbt.io.IO$.withTemporaryDirectory(IO.scala:500)
at sbt.internal.inc.AnalyzingCompiler$.$anonfun$compileSources$2(AnalyzingCompiler.scala:436)
at sbt.internal.inc.AnalyzingCompiler$.$anonfun$compileSources$2$adapted(AnalyzingCompiler.scala:428)
at sbt.io.IO$.withTemporaryDirectory(IO.scala:490)
at sbt.io.IO$.withTemporaryDirectory(IO.scala:500)
at sbt.internal.inc.AnalyzingCompiler$.compileSources(AnalyzingCompiler.scala:428)
at org.jetbrains.jps.incremental.scala.local.CompilerFactoryImpl$.org$jetbrains$jps$incremental$scala$local$CompilerFactoryImpl$$getOrCompileInterfaceJar(CompilerFactoryImpl.scala:154)
at org.jetbrains.jps.incremental.scala.local.CompilerFactoryImpl.$anonfun$getScalac$1(CompilerFactoryImpl.scala:54)
at scala.Option.map(Option.scala:242)
at org.jetbrains.jps.incremental.scala.local.CompilerFactoryImpl.getScalac(CompilerFactoryImpl.scala:47)
at org.jetbrains.jps.incremental.scala.local.CompilerFactoryImpl.createCompiler(CompilerFactoryImpl.scala:25)
at org.jetbrains.jps.incremental.scala.local.CachingFactory.$anonfun$createCompiler$3(CachingFactory.scala:24)
at org.jetbrains.jps.incremental.scala.local.Cache.$anonfun$getOrUpdate$2(Cache.scala:20)
at scala.Option.getOrElse(Option.scala:201)
at org.jetbrains.jps.incremental.scala.local.Cache.getOrUpdate(Cache.scala:19)
at org.jetbrains.jps.incremental.scala.local.CachingFactory.createCompiler(CachingFactory.scala:24)
at org.jetbrains.jps.incremental.scala.local.LocalServer.doCompile(LocalServer.scala:43)
at org.jetbrains.jps.incremental.scala.local.LocalServer.compile(LocalServer.scala:30)
at org.jetbrains.jps.incremental.scala.remote.Main$.compileLogic(Main.scala:207)
at org.jetbrains.jps.incremental.scala.remote.Main$.$anonfun$handleCommand$1(Main.scala:190)
at org.jetbrains.jps.incremental.scala.remote.Main$.decorated$1(Main.scala:180)
at org.jetbrains.jps.incremental.scala.remote.Main$.handleCommand(Main.scala:187)
at org.jetbrains.jps.incremental.scala.remote.Main$.serverLogic(Main.scala:163)
at org.jetbrains.jps.incremental.scala.remote.Main$.nailMain(Main.scala:103)
at org.jetbrains.jps.incremental.scala.remote.Main.nailMain(Main.scala)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.martiansoftware.nailgun.NGSession.run(NGSession.java:319)

Cause: the scala.version configured in pom.xml does not match the locally installed Scala version, which is 2.12.15.

Solution: update scala.version in pom.xml to match the installed version.
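For example, assuming the project declares the Scala version through a scala.version property (the property names may differ in your pom.xml; scala.binary.version is a common companion property, included here only as an illustration), aligning it with the locally installed 2.12.15 looks like:

```
<properties>
    <scala.version>2.12.15</scala.version>
    <scala.binary.version>2.12</scala.binary.version>
</properties>
```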

[Solved] Spark Error: ERROR StatusLogger No log4j2 configuration file found

I. Introduction

While running the Kafka-related programs of a Spark project, the warning below appeared. It did not affect execution, but it was irritating enough to clear up immediately.

ERROR StatusLogger No log4j2 configuration file found. 
Using default configuration: logging only errors to the console.

II. Problem-solving

1. Add log4j2.xml

The log level can be configured in the Loggers section:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} [%t] %-5p %c{1}:%L - %msg%n" />
        </Console>

        <RollingFile name="RollingFile" fileName="log/test.log"
                     filePattern="${logPath}/%d{yyyyMMddHHmmss}-fargo.log">
            <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} [%t] %-5p %c{1}:%L - %msg%n" />
            <Policies>
                <SizeBasedTriggeringPolicy size="10 MB" />
            </Policies>
            <DefaultRolloverStrategy max="20" />
        </RollingFile>

    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Console" />
            <AppenderRef ref="RollingFile" />
        </Root>
    </Loggers>
</Configuration>

2. Add the file

Place log4j2.xml in the src/main/resources folder and run mvn install.

3. Result

Since the root level in the file is set to info, many related logs now appear; adjust the level yourself as needed.

[Solved] A needed class was not found. This could be due to an error in your runpath. Missing class: scala/co

A needed class was not found. This could be due to an error in your runpath. Missing class: scala/collection/GenTraversableOnce$class
java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
	at com.twitter.util.JMapWrapper.<init>(LruMap.scala:33)
	at com.twitter.util.LruMap.<init>(LruMap.scala:41)
	at com.twitter.util.LruMap.<init>(LruMap.scala:44)
	at

Check your POM file and make sure the Scala version and third-party dependency versions are consistent; the missing scala/collection/GenTraversableOnce$class indicates a dependency compiled against Scala 2.11 or earlier running on a newer Scala. Use newer, matching versions, and clear IDEA's caches (File > Invalidate Caches), since stale caches may keep old jars on the runpath.

Hadoop reports an error: cannot access scala.Serializable; and Python MapReduce reports an error

Notes on problems encountered while doing a school Hadoop assignment. The assignment is fairly basic: calling Hadoop through a makefile to execute a MapReduce program written in advance.

Error 1

The Hadoop wordcount code failed with:

java: cannot access scala.Serializable class file for scala.Serializable not found

Solution:
Based on a Q&A on Stack Overflow, the Scala version appears to be incompatible with the Hadoop version; rolling back to 2.7 solved the problem.

Error 2

Attempting to run Python on Hadoop also failed, and the error information was not detailed.

Solution:
Add the following at the beginning of the source code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

(Encoding-format problems like this are genuinely hard to debug.)

How to Solve Show() error caused by empty data

To avoid the show() error caused by null data, use !x.isNullAt(1) during filtering to test whether the column is null and discard rows where it is.


// Filter: .getDouble(1) reads index 1, i.e. the second column (indices start from 0)
DF.filter(x => !x.isNullAt(1) && x.getDouble(1) < 1995).show(10)
    

Maven compiles Scala and reports an error: StackOverflowError

Error during Maven compilation: java.lang.StackOverflowError

preface

Errors like this are usually caused by the Java thread stack size, but that was not the cause here. You may have heard that in Scala 2.10.x, a case class with more than 22 elements fails to compile. My case class does have more than 130 member variables, but the project uses Scala 2.11, so I do not think the 22-field limit is the problem. In my experiments, limiting the class to about 100 members compiled fine; even so, I was too lazy to split the case class apart.

Online solution (the one that worked for me)

A fix posted online for a different problem also solved mine: add the configuration parameters directly to the POM file to enlarge the compiler's thread stack.

<plugin>
    <groupId>net.alchim31.maven</groupId>
    <artifactId>scala-maven-plugin</artifactId>
    <version>3.4.0</version>
    
    <!-- Add-->
    <configuration>
        <displayCmd>true</displayCmd>
        <jvmArgs>
            <jvmArg>-Xss20m</jvmArg>
        </jvmArgs>
    </configuration>
    
    
    <executions>
        <execution>
            <goals>
                <goal>compile</goal>
                <goal>testCompile</goal>
            </goals>
        </execution>
    </executions>
</plugin>