RevPi Connect loses ethernet after about 6 days. [Solved]

jce

Post by jce »

Hi Kunbus community, I hope to learn something about a problem we are facing with a Revolution Pi Connect.

After about a week of uptime, Ethernet communication seems to stop. We replaced the RevPi at the customer site and brought this one to the office. Here we powered it up again, and after about a week Ethernet communication stopped again. This affects both the internal Ethernet adapters and an external USB Ethernet adapter. HDMI output, keyboard and USB mass storage remain functional. The LEDs next to the Ethernet port light up and blink as expected, but no IP address is received from DHCP. After a reboot, an IP address is obtained via DHCP.
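The symptom above (link LEDs active, but no address from DHCP) can be checked from a script. The following is a minimal sketch, not something taken from the RevPi image: the function names are made up for illustration, and it assumes the standard Linux sysfs files `/sys/class/net/<iface>/carrier` and `/sys/class/net/<iface>/operstate` plus the text output of `ip -4 -o addr show dev <iface>`.

```python
import re
from pathlib import Path


def link_state(iface, sysfs_root="/sys/class/net"):
    """Return carrier and operstate for iface as reported by the kernel."""
    base = Path(sysfs_root) / iface
    try:
        carrier = (base / "carrier").read_text().strip() == "1"
    except OSError:
        # Reading 'carrier' fails while the interface is administratively down.
        carrier = False
    operstate = (base / "operstate").read_text().strip()
    return {"carrier": carrier, "operstate": operstate}


def ipv4_addresses(ip_output):
    """Extract IPv4 addresses from the text printed by `ip -4 -o addr show`."""
    return re.findall(r"\binet (\d+\.\d+\.\d+\.\d+)/\d+", ip_output)


def looks_stuck(state, addresses):
    """True when the link is up but no IPv4 address is assigned."""
    return state["carrier"] and not addresses
```

To use it, pass the captured output of `ip -4 -o addr show dev eth0` to `ipv4_addresses()` and combine the result with `link_state("eth0")`. Logging `looks_stuck()` periodically would show when the interface enters the state described here.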

The module is used standalone, no expansion cards. Its job is to communicate via RS485 and ethernet.
The kernel log shows some suspicious output that, to me, seems to point to the Ethernet adapter and some IP internals in the kernel.

Can this be related to some hardware failure? The onboard USB comes before the off-board USB, and keyboard and mass storage work fine. Are there known issues with this kernel? If there were, I would expect more users to have noticed them already. Any help or ideas regarding this are welcome.

Suspicious log entries:

Code:

...
Sep 16 07:25:26 RevPi18193 kernel: [64317.280839] smsc95xx 1-1.5.1:1.0 eth1: Failed to read reg index 0x00000114: -110
Sep 16 07:25:26 RevPi18193 kernel: [64317.280848] smsc95xx 1-1.5.1:1.0 eth1: Error reading MII_ACCESS
Sep 16 07:25:26 RevPi18193 kernel: [64317.280853] smsc95xx 1-1.5.1:1.0 eth1: MII is busy in smsc95xx_mdio_read
Sep 16 07:25:26 RevPi18193 kernel: [64317.280859] smsc95xx 1-1.5.1:1.0 eth1: Failed to read MII_BMSR
...
Sep 20 11:53:52 RevPi18193 kernel: [425887.832305] ax88179_178a 1-1.3:1.0 eth2: ax88179 - Link status is: 0
Sep 20 11:53:52 RevPi18193 kernel: [426024.980338] INFO: task kworker/0:1:905 blocked for more than 120 seconds.
Sep 20 11:53:52 RevPi18193 kernel: [426024.980345]       Tainted: G           O    4.9.76-rt60-v7+ #1
Sep 20 11:53:52 RevPi18193 kernel: [426024.980349] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 20 11:53:52 RevPi18193 kernel: [426024.980353] kworker/0:1     D    0   905      2 0x00000000
Sep 20 11:53:52 RevPi18193 kernel: [426024.980375] Workqueue: events linkwatch_event
Sep 20 11:53:52 RevPi18193 kernel: [426024.980395] [<80757530>] (__schedule) from [<807578c0>] (schedule+0x78/0x120)
Sep 20 11:53:52 RevPi18193 kernel: [426024.980404] [<807578c0>] (schedule) from [<8075a09c>] (schedule_timeout+0x2d8/0x4dc)
...

Version:

Code:

uname -a
Linux RevPi18193 4.9.76-rt60-v7+ #1 SMP PREEMPT RT Tue, 12 Mar 2019 15:19:36 +0100 armv7l GNU/Linux
cat /proc/version
Linux version 4.9.76-rt60-v7+ (admin@kunbus.de) (gcc version 8.2.0 (Debian 8.2.0-11) ) #1 SMP PREEMPT RT Tue, 12 Mar 2019 15:19:36 +0100
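The kinds of entries quoted above can be pulled out of a kernel log automatically. Below is a rough sketch that counts messages from the two USB Ethernet drivers seen in this log (`smsc95xx` and `ax88179_178a`) and lists hung-task reports. The function name and regular expressions are illustrative assumptions based only on the lines shown here, so other kernels may format these messages differently. For context, the `-110` in the first entry is the Linux errno for `ETIMEDOUT`.

```python
import re
from collections import Counter

# Matches e.g. "kernel: [64317.280839] smsc95xx 1-1.5.1:1.0 eth1: <message>"
USB_ETH = re.compile(
    r"kernel: \[\s*[\d.]+\] (smsc95xx|ax88179_178a) \S+ (eth\d+): (.*)"
)
# Matches e.g. "INFO: task kworker/0:1:905 blocked for more than 120 seconds."
HUNG = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def summarize(lines):
    """Count USB Ethernet driver messages and collect hung-task reports."""
    driver_msgs = Counter()  # (driver, interface) -> number of messages
    hung = []                # list of (task name, pid)
    for line in lines:
        m = USB_ETH.search(line)
        if m:
            driver_msgs[(m.group(1), m.group(2))] += 1
            continue
        m = HUNG.search(line)
        if m:
            hung.append((m.group(1), int(m.group(2))))
    return driver_msgs, hung
```

It could be run on the lines of a log file, for example `summarize(open("/var/log/kern.log"))`. That path is the usual Debian location and is assumed here, not confirmed for this device.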
Attachments
dhcpcd.journal.txt.zip: dhcpcd journal (1.13 KiB)
test_desk_kern.zip: complete kern.log (17.84 KiB)
Last edited by jce on 19 Oct 2021, 12:02, edited 1 time in total.
->Johannes<-

Re: RevPi Connect loses ethernet after about 6 days.

Post by ->Johannes<- »

Hi jce,

Briefly summarised, for my understanding:

You exchanged the unit at the customer's site, and the new unit there does not cause any problems?
You can only reproduce the problem with this one device?

I have consulted our development department, and it is not easy to say what the problem is.

It is noticeable that a very old kernel is used here. You could upgrade to Buster for testing and see if the error persists.

Another possibility is a defect in the hardware, as you have already asked.
A further possibility would be EMC, but you wrote that the error occurs both at the customer's site and at your office. And if I read it correctly, there is no other hardware with a possible EMC load connected or nearby, is there?

Regards

Johannes
nicolaiB
KUNBUS

Re: RevPi Connect loses ethernet after about 6 days.

Post by nicolaiB »

Hi jce,

I remember a similar experience with devices running an older image, as yours does. Please update to our latest image (Buster), as your release already has two successor releases (Stretch and, for a few months now, Buster). You can find the instructions in our tutorial section: https://revolutionpi.com/tutorials/imag ... rect=en_US

The images can be found here: https://revolutionpi.de/tutorials/downl ... evpiimages

BR Nicolai
jce

Re: RevPi Connect loses ethernet after about 6 days.

Post by jce »

Gentlemen, thank you for your time and patience. I did not reply sooner because I wanted to see the device run fine for some time first.

The device has been updated to:

Code:

uname -a
Linux RevPi18193 4.19.95-rt38-v7 #1 SMP PREEMPT RT Tue, 22 Jun 2021 14:13:31 +0000 armv7l GNU/Linux
It has now been running without the connection issue for 14 days, so the issue looks solved with the new firmware.

For reference, a dmesg output of this run is attached to this post. It still shows a few smsc95xx-related warnings, so I guess these were not related to the issue. EMC might have been related, as the device was deployed in close proximity to some AFEs.
Attachments
dmesg_211019.zip (8.41 KiB)