[PTLsim-devel] Internal Loads / Stores (LDP / STP) vs. cache hierarchy

Stephan Diestelhorst
Fri Jul 6 08:33:53 EDT 2007


Sorry, that was a stupid mistake on my side (see below)!

> Well, it seems like there is some problem... Regardless of whether I try it
> with the new modifications (see previous mail) for internal loads/stores or
> with the standard original behaviour, I see strange effects:
>
> I have two instructions, which (for testing purposes) increment and
> retrieve an internal counter, such as this:

> Testing this with code like that:
>
> void PTLSsim_inc_get_loop() {
>     long counter;
>     int i;
>     for (i = 0; i < 10; i++) {
>             asm (
>                     INC
>                     REX64 GET(REG_CX)
>                     :"=c"(counter)
>             );
>             printf("i: %d counter: %ld\n", i, counter);
>     }
> }
>
> Results in:
> i: 0 counter: 1
> i: 1 counter: 1
> i: 2 counter: 1
> i: 3 counter: 1
> i: 4 counter: 1

The optimising compiler assumed that the asm statement had no side effects
beyond its declared output and therefore hoisted it out of the loop, so the
instructions ran only once. With the volatile keyword added to the asm
statement, everything works as expected.
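
For the archive, here is a minimal sketch of the fix, assuming GCC-style inline
assembly on x86-64. The PTLsim INC and GET opcodes are not reproduced here:
inc_get() and its in-memory counter are hypothetical stand-ins that mimic an
instruction whose side effect the compiler cannot see. On other targets the
sketch falls back to plain C, which is also an assumption of mine.

```c
#include <stdio.h>

/* Hypothetical stand-in for the INC + GET pair: bump a counter that lives
 * in memory and return its new value. */
unsigned long inc_get(unsigned long *hidden)
{
    unsigned long value;
#if defined(__x86_64__) && defined(__GNUC__)
    /* "volatile" marks the statement as having side effects beyond its
     * outputs, so the compiler may neither delete it nor hoist it out of
     * a loop. The "memory" and "cc" clobbers declare what is modified. */
    __asm__ volatile (
        "incq (%1)\n\t"      /* side effect: increment *hidden */
        "movq (%1), %0"      /* read the new value back        */
        : "=r"(value)
        : "r"(hidden)
        : "memory", "cc");
#else
    /* Assumed fallback for other targets: same effect in plain C. */
    value = ++*(volatile unsigned long *)hidden;
#endif
    return value;
}

/* Loop from the quoted test, rewritten against the stand-in. */
void inc_get_loop(unsigned long *hidden)
{
    int i;
    for (i = 0; i < 10; i++) {
        unsigned long counter = inc_get(hidden);
        printf("i: %d counter: %lu\n", i, counter);
    }
}
```

Starting from a zeroed counter, successive calls to inc_get() return 1, 2, 3
and so on, rather than the repeated 1 shown in the quoted output.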

Apologies for spamming the list!

Sorry,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst at amd.com, Tel.   (AMD: 8-4903)



