Topic: Bus error
Tupas
So, I type sudo apt-get install any_package_name in the console, and in response I get a pile of text like this:
[ 2927.929002] ata1.00: status: { DRDY ERR }
[ 2927.942856] ata1.00: error: { UNC }
[ 2931.680190] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 2931.694304] ata1.00: BDMA stat 0x25
[ 2931.708326] ata1.00: failed command: READ DMA
[ 2931.722101] ata1.00: cmd c8/00:08:08:a5:0c/00:00:00:00:00/e0 tag 0 dma 4096 in
[ 2931.722107] ata1.00: res 51/40:03:0c:a5:0c/00:00:00:00:00/e0 Emask 0x9 (media error)
[ 2931.777373] ata1.00: status: { DRDY ERR }
[ 2931.791188] ata1.00: error: { UNC }
[ 2931.828780] end_request: I/O error, dev sda, sector 828684
Bus error
What should I do?
Deathrose
> What should I do?
Check the cables)) Check the hard drive)))
sht0rm
Replace the cable, replace the hard drive.
Tupas
Well, I'll try replacing the cable. But what can I use to check the hard drive?
sht0rm
666joy666
I had exactly the same thing… an end-to-end error, to be precise; you can look it up in SMART…
The fix is to deal with the cable first: plug it into a different port, then try a different cable, and as the last resort replace the drive.
Pace!
If this is a hard drive problem, then it should affect everything, not just installing packages, right?
666joy666
> If this is a hard drive problem, then it should affect everything, not just installing packages, right?
This error is not that critical… for me it only showed up when I tried to copy one particular file from one partition to another; I never saw it otherwise…
nd3
> mhdd, victoria
But you are on UBUNTU!!!
Check the disk for bad sectors:
badblocks -v /dev/sda
MA3X
[ 2931.791188] ata1.00: error: { UNC }: this says, in plain text, that the drive has a bad sector.
An uncorrectable read error.
Either work the drive over with mhdd or replace it. The second option is preferable.
Microsoft isn’t the answer.
Microsoft is the question, and the answer is NO.
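To connect the log lines from the first post, here is a small Python sketch that pulls the failing sector out of the end_request line and decodes the LBA from the res line. It assumes the usual libata layout of that line (status/error:nsect:lbal:lbam:lbah/...), ignores the high-order bytes (all zero in this log), and assumes 512-byte sectors. The function names are made up for illustration.

```python
import re

SECTOR_SIZE = 512  # assumption: the kernel reports sectors in 512-byte units


def parse_failed_sector(line):
    """Return (device, sector) from an 'I/O error, dev X, sector N' line."""
    match = re.search(r"I/O error, dev (\w+), sector (\d+)", line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def lba_from_result(line):
    """Decode the 24 low LBA bits from a libata 'res' line (assumed layout)."""
    match = re.search(
        r"res ([0-9a-f]{2})/([0-9a-f]{2}):([0-9a-f]{2})"
        r":([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2})/",
        line,
    )
    if match is None:
        return None
    lbal, lbam, lbah = (int(match.group(i), 16) for i in (4, 5, 6))
    return (lbah << 16) | (lbam << 8) | lbal


res_line = "ata1.00: res 51/40:03:0c:a5:0c/00:00:00:00:00/e0 Emask 0x9 (media error)"
err_line = "end_request: I/O error, dev sda, sector 828684"

device, sector = parse_failed_sector(err_line)
print(device, sector)             # sda 828684
print(lba_from_result(res_line))  # 828684
print(sector * SECTOR_SIZE)       # byte offset on the disk, under the 512-byte assumption
```

Both lines point at the same sector, which fits the reading that one specific spot on the disk cannot be read.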
Tupas
It found 120 bad blocks. How do I fix them?
nd3
How to fix them? badblocks -vw /dev/sda(1) runs the check with a write test.
Warning!!!! All data will be destroyed!
Sectors that fail the write-read test will be remapped by the drive into its defect list (SMART will show them as reallocated). By and large, though, there is only one way out: replace the hard drive.
Tupas
> How to fix them? badblocks -vw /dev/sda(1) runs the check with a write test.
> Warning!!!! All data will be destroyed!
Is there really no way to do it without destroying the data?
And I suppose this has to be run from another disk?
nd3
>> How to fix them? badblocks -vw /dev/sda(1) runs the check with a write test.
>> Warning!!!! All data will be destroyed!
> Is there really no way to do it without destroying the data?
> And I suppose this has to be run from another disk?
No way, none at all. That is obvious.
MA3X
I accept no more than 5-7 bad blocks across the whole surface for a drive to still count as healthy.
If there are more, then at the very least it does not go into production machines. Temporary storage for non-critical data only.
And 120 means straight into the furnace, no question.
BPO | 15589 |
---|---|
Nosy | @loewis, @birkenfeld, @vstinner, @larryhastings, @ned-deily, @skrah |
Files | |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
assignee = None
closed_at = <Date 2012-08-12.09:55:38.356>
created_at = <Date 2012-08-08.09:16:50.332>
labels = ['type-crash']
title = 'Bus error on Debian sparc'
updated_at = <Date 2012-08-12.09:55:38.354>
user = 'https://github.com/skrah'
bugs.python.org fields:
activity = <Date 2012-08-12.09:55:38.354>
actor = 'skrah'
assignee = 'none'
closed = True
closed_date = <Date 2012-08-12.09:55:38.356>
closer = 'skrah'
components = []
creation = <Date 2012-08-08.09:16:50.332>
creator = 'skrah'
dependencies = []
files = ['26727']
hgrepos = []
issue_num = 15589
keywords = []
message_count = 21.0
messages = ['167678', '167679', '167701', '167706', '167713', '167714', '167715', '167716', '167717', '167718', '167723', '167724', '167725', '167728', '167733', '167735', '167736', '167737', '167777', '167805', '168030']
nosy_count = 8.0
nosy_names = ['loewis', 'georg.brandl', 'vstinner', 'larry', 'flub', 'ned.deily', 'skrah', 'python-dev']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue15589'
versions = ['Python 3.3']
Running *any* test of the test suite currently produces a bus error
on Debian sparc [http://people.debian.org/~aurel32/qemu/sparc/].
After the bus error, the tests seem to proceed normally though.
This is definitely new. I tested memoryview for bus errors
a couple of months ago without problems.
Georg, I’m provisionally setting this to release blocker. The
qemu-sparc image is quite old though (Debian Etch). It’s a pity
we don’t have a sparc buildbot any more.
Example:
user@debian-sparc:~/cpython$ ./python -m test -uall -v test_flufl
== CPython 3.3.0b1 (default:67d36e8ddcfc+, Aug 7 2012, 23:49:57) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)]
Fatal Python error: Bus error
Current thread 0x00004000:
File "/home/user/cpython/Lib/subprocess.py", line 1363 in _execute_child
File "/home/user/cpython/Lib/subprocess.py", line 818 in __init__
File "/home/user/cpython/Lib/os.py", line 995 in popen
File "/home/user/cpython/Lib/platform.py", line 903 in _syscmd_uname
File "/home/user/cpython/Lib/platform.py", line 1147 in uname
File "/home/user/cpython/Lib/platform.py", line 1452 in platform
File "/home/user/cpython/Lib/test/regrtest.py", line 537 in main
File "/home/user/cpython/Lib/test/main.py", line 13 in <module>
File "/home/user/cpython/Lib/runpy.py", line 73 in _run_code
File "/home/user/cpython/Lib/runpy.py", line 160 in _run_module_as_main
== Linux-2.6.18-6-sparc32-sparc-with-debian-4.0 big-endian
== /home/user/cpython/build/test_python_3262
Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1)
[1/1] test_flufl
test_barry_as_bdfl (test.test_flufl.FLUFLTests) ... ok
test_guido_as_bdfl (test.test_flufl.FLUFLTests) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.053s
OK
1 test OK.
Setting to critical: debian-sparc 32-bit is apparently deprecated
since Lenny and still uses linuxthreads.
Tracking down the failure could end up in finding a platform bug
like in bpo-12936.
From the position of the bus error, it would seem that calling a subprocess during platform.platform() is the culprit.
But if test_subprocess passes without any bus errors, that would be strange.
Is it by any chance a --shared build being run from the build directory without having been installed (and without an LD_LIBRARY_PATH and with an older version already installed)?
Running on Solaris 10 (T1000, OpenCSW toolchain, gcc 4.6.3) I also get a bus error, with added coredump:
$ ./python Lib/test/regrtest.py
== CPython 3.3.0b1 (default:67a994d5657d, Aug 8 2012, 21:43:48) [GCC 4.6.3]
== Solaris-2.10-sun4v-sparc-32bit big-endian
== /export/home/flub/python/cpython/build/test_python_7320
Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1)
[ 1/369] test_grammar
[ 2/369] test_opcodes
[ 3/369] test_dict
[ 4/369] test_builtin
[ 5/369] test_exceptions
test test_exceptions failed -- Traceback (most recent call last):
  File "/export/home/flub/python/cpython/Lib/test/test_exceptions.py", line 432, in testChainingDescriptors
    self.assertTrue(e.__suppress_context__)
AssertionError: False is not true
[ 6/369/1] test_types
[ 7/369/1] test_unittest
[ 8/369/1] test_doctest
[ 9/369/1] test_doctest2
[ 10/369/1] test_support
[ 11/369/1] test___all__
[ 12/369/1] test___future__
[ 13/369/1] test__locale
[ 14/369/1] test__osx_support
[ 15/369/1] test_abc
[ 16/369/1] test_abstract_numbers
[ 17/369/1] test_aifc
[ 18/369/1] test_argparse
[ 19/369/1] test_array
[ 20/369/1] test_ast
[ 21/369/1] test_asynchat
[ 22/369/1] test_asyncore
[ 23/369/1] test_atexit
[ 24/369/1] test_audioop
[ 25/369/1] test_augassign
[ 26/369/1] test_base64
[ 27/369/1] test_bigaddrspace
[ 28/369/1] test_bigmem
[ 29/369/1] test_binascii
[ 30/369/1] test_binhex
[ 31/369/1] test_binop
[ 32/369/1] test_bisect
[ 33/369/1] test_bool
[ 34/369/1] test_buffer
[ 35/369/1] test_bufio
[ 36/369/1] test_bytes
[ 37/369/1] test_bz2
[ 38/369/1] test_calendar
[ 39/369/1] test_call
[ 40/369/1] test_capi
Fatal Python error: Bus error
Current thread 0x00000001:
File "/export/home/flub/python/cpython/Lib/test/test_capi.py", line 264 in test_skipitem
File "/export/home/flub/python/cpython/Lib/unittest/case.py", line 385 in _executeTestPart
File "/export/home/flub/python/cpython/Lib/unittest/case.py", line 440 in run
File "/export/home/flub/python/cpython/Lib/unittest/case.py", line 492 in __call__
File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 105 in run
File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 67 in __call__
File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 105 in run
File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 67 in __call__
File "/export/home/flub/python/cpython/Lib/test/support.py", line 1312 in run
File "/export/home/flub/python/cpython/Lib/test/support.py", line 1413 in _run_suite
File "/export/home/flub/python/cpython/Lib/test/support.py", line 1447 in run_unittest
File "/export/home/flub/python/cpython/Lib/test/test_capi.py", line 290 in test_main
File "Lib/test/regrtest.py", line 1219 in runtest_inner
File "Lib/test/regrtest.py", line 941 in runtest
File "Lib/test/regrtest.py", line 714 in main
File "Lib/test/regrtest.py", line 1810 in <module>
Bus Error (core dumped)
Not sure if this should be tracked in the same issue or not?
I think I’ve identified one legit Python bug. This is from a *different*
traceback, i.e. the traceback in my first message is still unresolved.
A bus error occurs in test_capi, test_skipitem with format ‘D’:
Python/getargs.c:782
Py_complex *p = va_arg(*p_va, Py_complex *);
Py_complex cval;
cval = PyComplex_AsCComplex(arg);
if (PyErr_Occurred())
    RETURN_ERR_OCCURRED;
else
    *p = cval;   <- bus error
break;
The pointer p has value 0xefbfb1fc, with 0xefbfb1fc % 8 == 4. It originates
from a somewhat creatively allocated memory region in _testcapi:parse_tuple_and_keywords.
This platform is 8-byte aligned?
nm, I get it, doubles are 8-bytes and should be 8-byte aligned. Let me stare at it some more.
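The alignment arithmetic in the messages above can be checked with a few lines of Python. This is only an illustration of the modulo test and of rounding an address up to the next 8-byte boundary. The helper names are made up here, and it does not reproduce the actual patch attached to the issue.

```python
def is_aligned(address, alignment):
    """True if address is a multiple of alignment."""
    return address % alignment == 0


def align_up(address, alignment):
    """Round address up to the next multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


p = 0xEFBFB1FC  # pointer value quoted in the report

print(p % 8)                # 4
print(is_aligned(p, 8))     # False: not suitable for an 8-byte-aligned double
print(is_aligned(p, 4))     # True: fine for 4-byte objects
print(hex(align_up(p, 8)))  # 0xefbfb200
```

On a platform that requires doubles to sit on 8-byte boundaries, storing a Py_complex (two doubles) through such a pointer would be expected to trap, which matches the bus error described.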
Floris, the traceback in my first message only occurs in the
optimized regular build with -O3. Did you try that, too?
Attached is a patch attempting to force double alignment. Stefan: please apply and try it. Does this help?
I compiled with a simple "./configure" which I think is what you mean (it defaults to -O3). But when executing your test it doesn't give a bus error.
Larry Hastings <report@bugs.python.org> wrote:
> Attached is a patch attempting to force double alignment. Stefan: please apply and try it. Does this help?
Yes, this works nicely.
I think I can confirm this fixes the BusError. The test suite got past test_capi on my machine as well. Unfortunately I killed the ssh session by accident before the testsuite completed so I had to restart it.
As for the original error: in test_subprocess basically every test
fails. With the standard regrtest.py (faulthandler enabled), most
tests generate a bus error in subprocess_fork_exec():
621 cwd_obj2 = NULL;
(gdb)
624 pid = fork(); <- bus error
(gdb)
Fatal Python error: Bus error
Current thread 0x00004000:
File "/home/user/cpython/Lib/subprocess.py", line 1363 in _execute_child
File "/home/user/cpython/Lib/subprocess.py", line 818 in __init__
File "/home/user/cpython/Lib/test/test_subprocess.py", line 728 in test_bufsize_is_none
With all faulthandler references removed from regrtest.py no
bus errors happen, but most tests fail anyway. As I said, I’m
NOT blaming faulthandler, but suspect some strange platform
bug that perhaps involves linuxthreads.
Since Floris can’t reproduce this error, I’m setting the priority
to normal.
I can now confirm the whole testsuite runs, so the BusError part seems fixed on my host:
329 tests OK.
7 tests failed:
test_cmd_line test_exceptions test_ipaddress test_os test_raise
test_socket test_traceback
1 test altered the execution environment:
test_site
32 tests skipped:
test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp
test_codecmaps_kr test_codecmaps_tw test_curses test_dbm_gnu
test_epoll test_gdb test_kqueue test_lzma test_msilib
test_ossaudiodev test_pep277 test_readline test_smtpnet
test_socketserver test_sqlite test_ssl test_startfile test_tcl
test_timeout test_tk test_ttk_guionly test_ttk_textonly
test_unicode_file test_urllib2net test_urllibnet test_winreg
test_winsound test_xmlrpc_net test_zipfile64
8 skips unexpected on sunos5:
test_lzma test_readline test_smtpnet test_ssl test_tcl test_tk
test_ttk_guionly test_ttk_textonly
> 329 tests OK.
> 7 tests failed:
> test_cmd_line test_exceptions test_ipaddress test_os test_raise
> test_socket test_traceback
Thanks. A lot of these appear to be big-endian related, see bpo-15597.
> With all faulthandler references removed from regrtest.py no
> bus errors happen, but most tests fail anyway. As I said, I'm
> NOT blaming faulthandler, but suspect some strange platform
> bug that perhaps involves linuxthreads.
Threads + signal is a very complex problem. It is not solved yet in OpenBSD for example. There were a lot of such issues on old versions of FreeBSD. Extract of the Wikipedia article of LinuxThreads:
«LinuxThreads had a number of problems, mainly owing to the implementation, which used the clone system call to create a new process sharing the parent’s address space. For example, threads had distinct process identifiers, causing problems for signal handling; (…)»
If disabling faulthandler avoids new issues, you can add 'if sys.thread_info.version.startswith("linuxthreads"):' on the line:
faulthandler.enable(all_threads=True)
in regrtest.py.
I added sys.thread_info to be able to skip some tests only failing on LinuxThreads…
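A minimal sketch of that guard, written with the negated condition suggested in the reply that follows (enable faulthandler only when the thread library is not LinuxThreads). The helper names and the sample version strings are assumptions for illustration, and the None check is added because sys.thread_info.version may be unknown.

```python
import faulthandler
import sys


def uses_linuxthreads(version):
    """True if a thread-library version string names LinuxThreads."""
    return bool(version) and version.startswith("linuxthreads")


def enable_faulthandler_unless_linuxthreads():
    """Enable faulthandler except on LinuxThreads; return whether it was enabled."""
    version = getattr(sys.thread_info, "version", None)
    if uses_linuxthreads(version):
        return False
    faulthandler.enable(all_threads=True)
    return True


# Hypothetical version strings, only to show how the check behaves.
print(uses_linuxthreads("linuxthreads-0.10"))  # True
print(uses_linuxthreads("NPTL 2.31"))          # False
print(uses_linuxthreads(None))                 # False
```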
—
> but most tests fail anyway
Ah? With which message? Can you get more information in gdb?
> If disabling faulthandler avoids new issues, you can add 'if
> [not] sys.thread_info.version.startswith("linuxthreads")'
That suppresses some bus errors. However, they still occur without
being raised (some print statements and a WIFSIGNALED test inserted
in posix_waitpid):
>>> import subprocess, os
>>> p = subprocess.Popen(["/bin/true"])
>>> os.waitpid(p.pid, os.WNOHANG)
pid: 4461 options: 1 signo: 10
(4461, 10)
>>>
So a bus error occurs in waitpid(pid, &status, options). WAIT_TYPE
is int, perhaps that’s incorrect for the platform, but I can’t get
hold of the posix man pages for debian-etch-sparc.
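For reference, the status value in the session above can be decoded with the standard os helpers. This is a sketch for Unix-like systems. The function name is made up, and which signal the number 10 stands for depends on the architecture (the message above reads it as a bus error on that sparc system).

```python
import os


def describe_wait_status(status):
    """Classify a raw wait status as returned by os.waitpid()."""
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    return ("other", status)


# The status reported in the session above was 10.
print(describe_wait_status(10))  # ('signaled', 10)
print(describe_wait_status(0))   # ('exited', 0)
```

A status of 10 therefore means the child was terminated by signal number 10, rather than having exited with a code.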
I'd like to urge everybody to focus on one issue at a time. This issue is about Python crashing on a SparcLinux qemu image, so I think it should have priority "low": there is absolutely no requirement that this needs to work.
As for the test failures on Solaris, please report them as separate issues (one per failure, "normal" priority seems right).
Closing since the remaining issue is almost certainly a platform bug.
mmap
Minimal POSIX 7 example
A "bus error" happens when the kernel sends SIGBUS to a process.
A minimal example that produces it because ftruncate was forgotten:
#define _XOPEN_SOURCE 700 /* expose the POSIX declarations under -std=c99 */
#include <fcntl.h>    /* O_ constants */
#include <unistd.h>   /* ftruncate */
#include <sys/mman.h> /* mmap, shm_open, shm_unlink */

int main(void) {
    int fd;
    int *map;
    int size = sizeof(int);
    char *name = "/a";

    shm_unlink(name);
    fd = shm_open(name, O_RDWR | O_CREAT, (mode_t)0600);
    /* THIS is the cause of the problem: the new object keeps its size of 0. */
    /*ftruncate(fd, size);*/
    map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    /* This is what generates the SIGBUS. */
    *map = 0;
}
Run with:
gcc -std=c99 main.c -lrt
./a.out
Tested on Ubuntu 14.04.
POSIX describes SIGBUS as:
Access to an undefined portion of a memory object.
The mmap specification says that:
References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.
And shm_open says that it generates objects of size 0:
The shared memory object has a size of zero.
So at *map = 0 we are touching past the end of the allocated object.
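For contrast, here is a Python sketch of the working order of operations: give the object a size first, then map it and write to it. It uses an anonymous temporary file instead of a POSIX shared memory object, which is an assumption made to keep the example portable. The function name is made up.

```python
import mmap
import os
import struct
import tempfile


def write_int_via_mmap(value):
    """Size a fresh file, map it, store an int through the mapping, read it back."""
    size = struct.calcsize("i")
    with tempfile.TemporaryFile() as handle:
        fd = handle.fileno()
        # The step that was missing in the C example: a newly created
        # object has size 0, so it must be extended before it is touched.
        os.ftruncate(fd, size)
        with mmap.mmap(fd, size) as mapping:
            mapping[:size] = struct.pack("i", value)
            return struct.unpack("i", mapping[:size])[0]


print(write_int_via_mmap(42))  # 42
```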
- How do I fix PCIe bus error severity?
- What causes PCIe errors?
- What is PCI Noaer?
- What is PCI parity error?
- What is PCI Express x16?
- What is malformed TLP in PCIe?
- How does PCIe Aer work?
- How do I install PCI Nomsi?
- What is Grub_cmdline_linux_default?
- What is quiet splash?
- How do I fix parity error?
- How do I fix PCI error?
How do I fix PCIe bus error severity?
To suppress these messages, edit the GRUB configuration: open /etc/default/grub in a text editor, add a kernel parameter such as pci=noaer to the GRUB_CMDLINE_LINUX_DEFAULT line, and run sudo update-grub. Restart Ubuntu and you shouldn't see the 'PCIe Bus Error severity Corrected' messages anymore. If this doesn't fix the issue for you, you can try other kernel parameters.
What causes PCIe errors?
It may be a hardware bug in the device, in the PCIe root controller on the motherboard, in the specific interaction of those two, or something else. By using pci=nommconf , the configuration space of all devices will be accessed in the original way, and changing the access methods works around this problem.
What is PCI Noaer?
pci=noaer is a kernel parameter that disables PCIe Advanced Error Reporting (AER).
What is PCI parity error?
If a PCI card is seated loosely or incorrectly (for example because of dust in the slot), or if the PCI slot itself has failed, the computer cannot reliably reach and read the connected hardware and reports this error message.
What is PCI Express x16?
PCIe (peripheral component interconnect express) is an interface standard for connecting high-speed components. … Most GPUs require a PCIe x16 slot to operate at their full potential.
What is malformed TLP in PCIe?
Malformed packets:
PCIe defines the transaction rules at each layer. Any transaction or packet that violates these rules is considered a malformed TLP. Examples: the data payload exceeds the maximum payload size, the actual data length does not match the data length specified in the header, TC to VC mapping violations.
How does PCIe Aer work?
When AER is enabled, a PCI Express device will automatically send an error message to the PCIe root port above it when the device captures an error. The Root Port, upon receiving an error reporting message, internally processes and logs the error message in its PCI Express capability structure.
How do I install PCI Nomsi?
Follow these steps precisely:
- Boot your system as usual, then open a terminal (Ctrl + Alt + T) and run sudo cp /etc/default/grub /etc/default/grub.bak followed by sudo gedit /etc/default/grub. …
- Update grub and restart your system: run sudo update-grub, then sudo reboot.
What is Grub_cmdline_linux_default?
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash": this line appends its entries to the end of the 'linux' line (GRUB legacy's "kernel" line). The entries are appended in normal mode only. To view a black screen with the boot process displayed as text, remove "quiet splash".
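As an illustration of the edit described above, this Python sketch appends a kernel parameter to the GRUB_CMDLINE_LINUX_DEFAULT line of a config text. It works on a string rather than on /etc/default/grub itself. The function name and the sample text are assumptions, and update-grub would still need to be run after a real edit.

```python
import re

CMDLINE_PATTERN = re.compile(
    r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE
)


def add_kernel_param(grub_text, param):
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT."""

    def replace(match):
        params = match.group(2).split()
        if param not in params:  # keep the edit idempotent
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return CMDLINE_PATTERN.sub(replace, grub_text)


sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
updated = add_kernel_param(sample, "pci=noaer")
print(updated)
```

Applying the function twice leaves the line unchanged, so running it again does not duplicate the parameter.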
What is quiet splash?
From Unix & Linux, on quiet splash: The splash (which eventually ends up in your /boot/grub/grub.cfg) causes the splash screen to be shown. At the same time you want the boot process to be quiet, as otherwise all kinds of messages would disrupt that splash screen.
How do I fix parity error?
Parity errors offset the charge value and can bring back invalid or incorrect commands for the computer.
- Correct Electrical Source Problems.
- Remove ESD and EMI Sources.
- Adjust RAM Timing.
- Remove or Replace RAM Modules.
How do I fix PCI error?
How do I fix PCI BUS DRIVER INTERNAL errors?
- Update your drivers.
- Update Windows 10.
- Run the Hardware Troubleshooter.
- Run the SFC scan.
- Run DISM.
- Remove overclock settings.
- Remove problematic software.
- Reset Windows 10.
I installed Ubuntu 18.04 today and noticed the same problem. Installing the busybox-syslogd package solved it:
sudo apt-get install busybox-syslogd
Check the size of the log files and empty the large ones:
ls -s -S /var/log
result:
total 4352668
4021088 syslog 32 wtmp 4 gdm3
329168 kern.log 24 Xorg.0.log 4 hp
1776 dpkg.log 20 Xorg.1.log 4 installer
40 lastlog 20 Xorg.0.log.old 4 journal
Then empty the large files:
cd /var/log
sudo su
> syslog
> kern.log
Then, to be safe, follow the answer at https://askubuntu.com/a/1019225/725320
In case you can’t boot into Ubuntu and get stuck with these logs in your screen (same as me):
Dec 19 17:31:01 andrew kernel: [ 99.027473] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
Dec 19 17:31:01 andrew kernel: [ 99.027474] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
Dec 19 17:31:01 andrew kernel: [ 99.027475] pcieport 0000:00:1c.5: [ 0] Receiver Error
Dec 19 17:31:01 andrew kernel: [ 99.027479] pcieport 0000:00:1c.5: AER: Corrected error received: id=00e5
Dec 19 17:31:01 andrew kernel: [ 99.027826] pcieport 0000:00:1c.5: can't find device of ID00e5
Dec 19 17:31:01 andrew kernel: [ 99.027887] pcieport 0000:00:1c.5: AER: Multiple Corrected error received: id=00e5
- Use Recovery Mode to get a root shell
- Empty the large log files
- Boot into Ubuntu, install busybox-syslogd, and update the grub config
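When these messages flood the logs, it can help to see which device and error type they come from before changing anything. The sketch below parses a line in the format shown above. The regular expression is derived only from those sample lines, so other kernels may format the message differently. The names are made up.

```python
import re

AER_PATTERN = re.compile(
    r"(?P<device>\S+): PCIe Bus Error: "
    r"severity=(?P<severity>[^,]+), "
    r"type=(?P<type>[^,]+), "
    r"id=(?P<id>[0-9a-fA-F]+)\((?P<role>[^)]*)\)"
)


def parse_aer_line(line):
    """Return the fields of a 'PCIe Bus Error' log line, or None if it does not match."""
    match = AER_PATTERN.search(line)
    return match.groupdict() if match else None


line = (
    "Dec 19 17:31:01 andrew kernel: [ 99.027473] pcieport 0000:00:1c.5: "
    "PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)"
)
print(parse_aer_line(line))
```

Counting the parsed records per device over a log file would show whether a single port produces all of the messages.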