[PTLsim-devel] Domain crashes when switching back and forth between native & simulation.
Stephan Diestelhorst
Sat Oct 13 13:13:05 EDT 2007
Hi,
I have to bother you guys again...
I have a small test case where a program creates a few pthreads which just
spin and wait for some shared variable to change. That variable is changed
(by the main thread) after a certain amount of time has elapsed, and then the
main thread waits for its 'children' with pthread_join.
(If you're curious, it is some stripped down tinySTM intset benchmark. It is
available online)
In order to avoid the overhead, I switch to simulation mode just after thread
creation and have experimented with switching back before and after the join.
Regardless of which I choose, however, my domain crashes when I try to invoke
the test tool again from the command line (a script calls it and changes a
command line parameter, which defines the number of threads to create).
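
For reference, here is a minimal sketch of a test case of the shape described
above. It is not the actual tinySTM-derived benchmark. The names
run_spin_test, switch_to_simulation and switch_to_native are made up for
illustration, and the two switch hooks are empty placeholders marking where the
mode change would be requested, not real PTLsim calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define MAX_THREADS 16

static atomic_int stop_flag;      /* shared variable the workers spin on */
static atomic_int finished_count; /* number of workers that saw the change */

/* Hypothetical hooks: placeholders for whatever mechanism switches the
 * domain between native and simulation mode. They do nothing here. */
static void switch_to_simulation(void) {}
static void switch_to_native(void) {}

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_flag))
        ; /* busy-wait until the main thread flips the flag */
    atomic_fetch_add(&finished_count, 1);
    return NULL;
}

/* Creates nthreads spinning workers, releases them after duration_ms
 * milliseconds, joins them, and returns how many of them finished. */
int run_spin_test(int nthreads, long duration_ms)
{
    pthread_t tids[MAX_THREADS];
    struct timespec delay = { duration_ms / 1000,
                              (duration_ms % 1000) * 1000000L };

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    atomic_store(&stop_flag, 0);
    atomic_store(&finished_count, 0);

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }

    switch_to_simulation();      /* just after thread creation */
    nanosleep(&delay, NULL);     /* let the workers spin for a while */
    atomic_store(&stop_flag, 1); /* change the shared variable */

    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    switch_to_native();          /* the "after the join" variant */

    return atomic_load(&finished_count);
}
```

Moving the switch_to_native() call above the join loop gives the "before the
join" variant; a driver script would then rerun the program with a different
thread count each time.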
This is with an almost unmodified PTLsim R219, with a few patches applied
to increase the number of physical registers to allow 4 SMTs and most of my
patches regarding internal stores and prefetches (posted to the list
previously!)
I use SMTcore with ENABLE_SMT defined. This is without any dirty tricks such
as SMP or MOESI.
The first executions of the test run fine, but usually the second or third
one crashes the domain. I have tried inserting a sleep command between
successive executions of the benchmark, as it seemed to work when I started
the test tool by hand rather than from a script. Logs below...
All domains have been pinned with phys_core = vcpu_id.
Is this a known issue? If so, would upgrading to a newer version help?
I'm wondering why PTLsim keeps printing the warnings about the state it is in.
Something doesn't seem to be quite right there...
I have tried to find out something about the addresses Xen is complaining
about (i.e. last fixup and RIP): neither of them is found anywhere, i.e.
neither in the guest's kernel nor inside my test tool (at least not with plain
objdump; perhaps some relocation has taken place).
Any clues on this?
Many thanks!
Stephan
DomU (domain being simulated):
----------------------------------------------
Testing 1 cores with _empty.
Creating thread 0
Switching to core smt...
DONE
STARTING...
Test-thread finished!
STOPPING...
Waiting for thread 0
.. finished!
Switching to native...
DONE
Testing 2 cores with _empty.
Creating thread 0
Creating thread 1
Switching to core smt...
DONE
STARTING...
Test-thread finished!
STOPPING...
Test-thread finished!
Waiting for thread 0
.. finished!
Waiting for thread 1
.. finished!
Switching to native...
DONE
Testing 3 cores with _empty.
Set type : linked list
Duration : 1
Initial size : 256
Nb threads : 3
Value range : 65535
Seed : 0
Update rate : 20
Type sizes : int=4/long=8/ptr=8/word=8
Initializing STM
Adding 256 entries to set
Set size : 256
Creating thread 0
Creating thread 1
Creating thread 2
Switching to core smt...
PTLsim (logfile attached, loglevel 1):
-----------
ptlsim_bugfixes -domain ptlvm-smp -loglevel 1 -native
//
// PTLsim: Cycle Accurate x86-64 Full System SMP/SMT Simulator
// Copyright 1999-2007 Matt T. Yourst <yourst at yourst.com>
//
// Revision 219 (05:35:58)
// Built Oct 12 2007 19:10:02 on mautern.amd.com using gcc-4.1
// Running on merill.
//
Waiting for request...
Processing -domain ptlvm-smp -loglevel 1 -native
Warning: invalid value 'ptlvm-smp' for option -domain; ignoring
System Information:
Running on hypervisor version xen-3.0-x86_64 xen-3.0-x86_32p h
Xen is mapped at virtual address 0xffff800000000000
PTLsim is running across 4 VCPUs:
Physical CPU type: AMD K8 (Opteron, Athlon 64, Turion)
VCPU 0 core frequency: 1895 MHz
Physical CPU affinity for all VCPUs: 0 1 2 3
Memory Layout:
System: 4390912 pages, 17563648 KB
Domain: 262144 pages, 1048576 KB
PTLsim reserved: 32768 pages, 131072 KB
Page Tables: 356 pages, 1424 KB
PTLsim image: 575 pages, 2300 KB
Heap: 31837 pages, 127348 KB
Stack: 256 pages, 1024 KB
Interfaces:
PTLsim page table: 61762
Shared info mfn: 4040
Shadow shinfo mfn: 16401
PTLsim hostcall: event channel 3
PTLsim upcall: event channel 4
Switched to native mode
Breakout request received from native mode
Switched to simulation mode
Returned from switch to native: now back in sim
Waiting for request...
Processing -native
Switched to native mode
Breakout request received from native mode
Switched to simulation mode
Returned from switch to native: now back in sim
Waiting for request...
Processing -core smt -run
Switching to simulation core 'smt'...
Stopping after 9223372036854775807 commits
Completed 22416358 cycles, 21375022 commits: 196266 cycles/sec,
176516, insns/sec: rip 0xffffffff8011918e 0xffffffff801063aa
0xffffffff801063aa 0xffffffff80288fde
Stopped after 22419601 cycles and 21377523 instructions
Completed 22419601 cycles, 21377523 commits: 10 cycles/sec,
8, insns/sec: rip 0xffffffff801b7d45 0xffffffff801063aa 0xffffffff801063aa
0x4016a7
Waiting for request...
Processing -native
Switched to native mode
Breakout request received from native mode
Switched to simulation mode
Returned from switch to native: now back in sim
Waiting for request...
Processing -native
Switched to native mode
Breakout request received from native mode
Switched to simulation mode
ptlmon: Warning: cannot switch to simulation mode: domain 1 was already in
state 2
Returned from switch to native: now back in sim
Waiting for request...
Processing -core smt -run
Switching to simulation core 'smt'...
Stopping after 9223372036854775807 commits
Completed 3762928 cycles, 26188714 commits: 143183 cycles/sec,
168038, insns/sec: rip 0xffffffff80284ef7 0xffffffff801885fc
0xffffffff801063aa 0xffffffff8016a5b9
Stopped after 3786551 cycles and 26213841 instructions
Completed 3786551 cycles, 26213841 commits: 73 cycles/sec,
78, insns/sec: rip 0xffffffff80284ecd 0xffffffff801063aa 0xffffffff801063aa
0x4016a7
Waiting for request...
Processing -native
Switched to native mode
Breakout request received from native mode
Switched to simulation mode
Returned from switch to native: now back in sim
Waiting for request...
Processing -native
Switched to native mode
Breakout request received from native mode
Switched to simulation mode
ptlmon: Warning: cannot switch to simulation mode: domain 1 was already in
state 2
Xen
-----
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 0, rip
00000000004016a7, rsp 00007fffffffe1f0, ksp ffff88003fdfa000, cr3mfn 16955,
eflags 0000000000010246
(XEN) arch_finish_context_swap(vcpu 0, domain 1): kernel? 1, rip
000000000004d3d0, rsp 0000000007fbcba0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 1, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9efd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000206
(XEN) arch_finish_context_swap(vcpu 2, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9ffd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000206
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007ea0fd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000202
(XEN) printk: 4 messages suppressed.
(XEN) mm.c:1739:d1 Bad type (saw 00000000a8000001 != exp 00000000e0000000) for
mfn 40d53c (pfn 8ad)
(XEN) mm.c:625:d1 Error getting mfn 40d53c (pfn 8ad) from L1 entry
000000040d53c107 for dom1
(XEN) mm.c:1739:d1 Bad type (saw 00000000a8000001 != exp 00000000e0000000) for
mfn 40d53f (pfn 8aa)
(XEN) mm.c:625:d1 Error getting mfn 40d53f (pfn 8aa) from L1 entry
000000040d53f107 for dom1
(XEN) arch_finish_context_swap(vcpu 0, domain 1): kernel? 1, rip
ffffffff801063aa, rsp ffffffff803e9f50, ksp ffffffff803ea000, cr3mfn 4248815,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 1, domain 1): kernel? 1, rip
ffffffff80143872, rsp ffff88003f5f7f08, ksp ffff88003f5f8000, cr3mfn 16956,
eflags 0000000000000282
(XEN) arch_finish_context_swap(vcpu 2, domain 1): kernel? 1, rip
ffffffff801063aa, rsp ffff88000000df08, ksp ffff88000000e000, cr3mfn 4248815,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 0, rip
00000000004017fe, rsp 00007fffffffe1f0, ksp ffff88003f2f8000, cr3mfn 16955,
eflags 0000000000010246
(XEN) arch_finish_context_swap(vcpu 0, domain 1): kernel? 1, rip
000000000004d3d8, rsp 0000000007fbcba0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 1, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9efd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000202
(XEN) arch_finish_context_swap(vcpu 2, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9ffd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000202
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007ea0fd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000206
(XEN) arch_finish_context_swap(vcpu 0, domain 1): kernel? 1, rip
ffffffff80284ecd, rsp ffffffff80424eb8, ksp ffffffff803ea000, cr3mfn 16956,
eflags 0000000000000247
(XEN) arch_finish_context_swap(vcpu 1, domain 1): kernel? 1, rip
ffffffff801063aa, rsp ffff880000471f08, ksp ffff880000472000, cr3mfn 16956,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 2, domain 1): kernel? 1, rip
ffffffff801063aa, rsp ffff88000000df08, ksp ffff88000000e000, cr3mfn 4248815,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 0, rip
00000000004016a7, rsp 00007fffffffe1f0, ksp ffff88003f2f8000, cr3mfn 16955,
eflags 0000000000010246
(XEN) printk: 118 messages suppressed.
(XEN) mm.c:1739:d0 Bad type (saw 00000000e8000001 != exp 0000000080000000) for
mfn 423b (pfn 3fde9)
(XEN) Error installing user page table base mfn 16955 in domain 1 vcpu 0
(XEN) arch_finish_context_swap(vcpu 0, domain 1): kernel? 1, rip
000000000004d3d8, rsp 0000000007fbcba0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 1, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9efd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000206
(XEN) arch_finish_context_swap(vcpu 2, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9ffd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000202
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007ea0fd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000202
(XEN) mm.c:1739:d1 Bad type (saw 00000000a8000001 != exp 00000000e0000000) for
mfn 40d53e (pfn 8ab)
(XEN) mm.c:625:d1 Error getting mfn 40d53e (pfn 8ab) from L1 entry
000000040d53e107 for dom1
(XEN) mm.c:1739:d1 Bad type (saw 0000000098000001 != exp 00000000e0000000) for
mfn 4d2e (pfn 3f2f6)
(XEN) mm.c:625:d1 Error getting mfn 4d2e (pfn 3f2f6) from L1 entry
0000000004d2e107 for dom1
(XEN) arch_finish_context_swap(vcpu 0, domain 1): kernel? 1, rip
ffffffff801063aa, rsp ffffffff803e9f50, ksp ffffffff803ea000, cr3mfn 19758,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 1, domain 1): kernel? 1, rip
ffffffff801063aa, rsp ffff880000471f08, ksp ffff880000472000, cr3mfn 19758,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 2, domain 1): kernel? 0, rip
00000000004017fe, rsp 00007fffffffe1f0, ksp ffff88003fdea000, cr3mfn 19757,
eflags 0000000000010246
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 1, rip
ffffffff801063aa, rsp ffff88000000ff08, ksp ffff880000010000, cr3mfn 4248815,
eflags 0000000000000246
(XEN) arch_finish_context_swap(vcpu 0, domain 1): kernel? 1, rip
0000000000045000, rsp 0000000007fbd000, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000200
(XEN) arch_finish_context_swap(vcpu 1, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9efd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000206
(XEN) arch_finish_context_swap(vcpu 2, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007e9ffd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000202
(XEN) arch_finish_context_swap(vcpu 3, domain 1): kernel? 1, rip
00000000000113aa, rsp 0000000007ea0fd0, ksp 0000000000000000, cr3mfn 61762,
eflags 0000000000000206
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 1 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.0-unstable x86_64 debug=n Not tainted ]----
(XEN) Trap: invalid opcode (6)
(XEN) Error code 00000000
(XEN) Last fixup: rip ffff8300001402f6, cr2 ffff88000202f060
(XEN) Guest VCPU flags: 1
(XEN) CPU: 0
(XEN) RIP: e033:[<0000000000051cc8>]
(XEN) RFLAGS: 0000000000000282 CONTEXT: guest
(XEN) rax: 0000000000000003 rbx: 0000000007fbcd10 rcx: 0000000047110ed3
(XEN) rdx: 0000000000000003 rsi: 0000000007fbcd10 rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: 0000000007fbccb8 r8: 0000000000003000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000007b0
(XEN) cr3: 000000000f142000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=0000000007fbccb8:
(XEN) 0000000047110ed3 0000000000000000 0000000000051cc8 000000010000e030
(XEN) 0000000000010082 0000000007fbccf0 000000000000e02b 0000000007fbcea0
(XEN) 000000000004e55b 0000000000000007 0000000000000040 0000002600000000
(XEN) 000000000004da91 000000000018e1e0 0000000000000000 0000000000000004
(XEN) 0000000000183c20 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000051ed6 0000000000000000 0000000000000000 0000000000000000
(XEN) 0001000000000000 0000000000000033 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000033 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
--
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst at amd.com, Tel. (AMD: 8-4903)
AMD Saxony Limited Liability Company & Co. KG
Registered office (business address): Wilschdorfer Landstr. 101, 01109 Dresden,
Germany
Register court Dresden: HRA 4896
General partner authorized to represent: AMD Saxony LLC (registered office
Wilmington, Delaware, USA)
Managing directors of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ptlsim.log_dom_breaks_after_switch.zip
Type: application/x-zip
Size: 61520 bytes
Desc: not available
Url : https://ptlsim.org/pipermail/ptlsim-devel/attachments/20071013/5d2f53ee/attachment-0001.bin