litedram with vexriscv DDR4 SODIMM fails memtest (Xilinx VU9P + spd) #349

Open
jersey99 opened this issue Oct 5, 2023 · 5 comments

jersey99 commented Oct 5, 2023

Hi All,

I am trying to bring up DDR4 on the htg-940 board using litedram (I managed to get the SPD dump over I2C). I have made some progress but seem to have hit a dead end for now: the memtest fails on about half the data (50% or 75% data errors). I would appreciate it if someone more experienced with DDR4 could look at the memtest log or pin settings and give some feedback.

litex> reboot                                                                                                                                                   

       __   _ __      _  __                                                                                                                                                     
      / /  (_) /____ | |/_/ 
     / /__/ / __/ -_)>  <                                                                                                                                                                                          
    /____/_/\__/\__/_/|_|                                                                                                                                                                                          
  Build your hardware, easily!  

(c) Copyright 2012-2023 Enjoy-Digital         
(c) Copyright 2007-2015 M-Labs
BIOS built on Oct  5 2023 21:40:13
BIOS CRC passed (b723a448)                                                                                                                                                                                             

LiteX git sha1: 98eb27df                                                                                                                                                                                       

--=============== SoC ==================--
CPU:            VexRiscv @ 125MHz
BUS:            WISHBONE 32-bit @ 4GiB
CSR:            32-bit data                                                                                                                                                                                             
ROM:            128.0KiB                                                                                                                                                                                    
SRAM:           8.0KiB                                                                                                                                                                              
L2:             8.0KiB                                                                                                                                                                              
SDRAM:          8.0GiB 64-bit @ 1000MT/s (CL-9 CWL-9)
MAIN-RAM:       1.0GiB                                                                                                                                                                              

--========== Initialization ============-- 
Initializing SDRAM @0x40000000... 
Switching SDRAM to software control. 
Write leveling:                                                                                                                                                         
 tCK equivalent taps: 604                                                                                                                                                                                          
 Cmd/Clk scan (0-302)                                                                                                                                                                              
 |00011  |011111111  |111111111  |111111111| best: 188
 Setting Cmd/Clk delay to 188 taps.
 Data scan:                                                                                                                                                
 m0: |11111111111110000000000| delay: 00                     
 m1: |11111111111111100000000| delay: 00                               
 m2: |11111111111111110000000| delay: 00                               
 m3: |11111111111111110000000| delay: 00                               
 m4: |00001111111111111111111| delay: 57                                 
 m5: |00001111111111111111111| delay: 58                                 
 m6: |00000111111111111111111| delay: 66                                 
 m7: |00001111111111111111111| delay: 54
Write latency calibration:                                                                                                                                                                                          
m0:6 m1:6 m2:6 m3:6 m4:6 m5:6 m6:6 m7:6
Read leveling:                                                                                                                                                      
 m0, b00: |00000000000000000000000000000000| delays: -                                                                                                                                                                  
 m0, b01: |00000000000000000000000000000000| delays: -
 m0, b02: |00000000000000000000000000000000| delays: -
 m0, b03: |11111111111110000000000000000000| delays: 100+-100
 m0, b04: |00000000000000001111111111111111| delays: 371+-130
 m0, b05: |00000000000000000000000000000000| delays: -
 m0, b06: |00000000000000000000000000000000| delays: -
 m0, b07: |00000000000000000000000000000000| delays: -
 best: m0, b04 delays: 371+-130
 m1, b00: |00000000000000000000000000000000| delays: -
 m1, b01: |00000000000000000000000000000000| delays: -
 m1, b02: |00000000000000000000000000000000| delays: -
 m1, b03: |11111111111111000000000000000000| delays: 111+-111
 m1, b04: |00000000000000000111111111111111| delays: 387+-124
 m1, b05: |00000000000000000000000000000000| delays: -
 m1, b06: |00000000000000000000000000000000| delays: -
 m1, b07: |00000000000000000000000000000000| delays: -
 best: m1, b04 delays: 387+-124
 m2, b00: |00000000000000000000000000000000| delays: -
 m2, b01: |00000000000000000000000000000000| delays: -
 m2, b02: |00000000000000000000000000000000| delays: -
 m2, b03: |11111111100000000000000000000000| delays: 69+-69
 m2, b04: |00000000000011111111111111111000| delays: 319+-126
 m2, b05: |00000000000000000000000000000001| delays: 503+-08
 m2, b06: |00000000000000000000000000000000| delays: -
 m2, b07: |00000000000000000000000000000000| delays: -
 best: m2, b04 delays: 318+-127
 m3, b00: |00000000000000000000000000000000| delays: -
 m3, b01: |00000000000000000000000000000000| delays: -
 m3, b02: |00000000000000000000000000000000| delays: -
 m3, b03: |11111111000000000000000000000000| delays: 59+-59
 m3, b04: |00000000000111111111111111100000| delays: 294+-128
 m3, b05: |00000000000000000000000000000011| delays: 492+-18
 m3, b06: |00000000000000000000000000000000| delays: -
 m3, b07: |00000000000000000000000000000000| delays: -
 best: m3, b04 delays: 293+-127
 m4, b00: |00000000000000000000000000000000| delays: -
 m4, b01: |00000000000000000000000000000000| delays: -
 m4, b02: |00000000000000000000000000000000| delays: -
 m4, b03: |11111110000000000000000000000000| delays: 50+-50
 m4, b04: |00000000001111111111111111000000| delays: 274+-126
 m4, b05: |00000000000000000000000000000111| delays: 481+-30
 m4, b06: |00000000000000000000000000000000| delays: -
 m4, b07: |00000000000000000000000000000000| delays: -
 best: m4, b04 delays: 274+-126
 m5, b00: |00000000000000000000000000000000| delays: -
 m5, b01: |00000000000000000000000000000000| delays: -
 m5, b02: |00000000000000000000000000000000| delays: -
 m5, b03: |11111110000000000000000000000000| delays: 52+-52
 m5, b04: |00000000001111111111111111000000| delays: 273+-129
 m5, b05: |00000000000000000000000000001111| delays: 480+-30
 m5, b06: |00000000000000000000000000000000| delays: -
 m5, b07: |00000000000000000000000000000000| delays: -
 best: m5, b04 delays: 274+-129
 m6, b00: |00000000000000000000000000000000| delays: -
 m6, b01: |00000000000000000000000000000000| delays: -
 m6, b02: |00000000000000000000000000000000| delays: -
 m6, b03: |11000000000000000000000000000000| delays: 14+-14
 m6, b04: |00000111111111111111100000000000| delays: 202+-128
 m6, b05: |00000000000000000000000011111111| delays: 445+-65
 m6, b06: |00000000000000000000000000000000| delays: -
 m6, b07: |00000000000000000000000000000000| delays: -
 best: m6, b04 delays: 202+-127
 m7, b00: |00000000000000000000000000000000| delays: -
 m7, b01: |00000000000000000000000000000000| delays: -
 m7, b02: |00000000000000000000000000000000| delays: -
 m7, b03: |10000000000000000000000000000000| delays: 06+-06
 m7, b04: |00001111111111111111000000000000| delays: 184+-129
 m7, b05: |00000000000000000000000111111111| delays: 433+-77
 m7, b06: |00000000000000000000000000000000| delays: -
 m7, b07: |00000000000000000000000000000000| delays: -
 best: m7, b04 delays: 185+-129
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
 Write: 0x40000000-0x40200000 2.0MiB     
  Read: 0x40000000-0x40200000 2.0MiB     
 bus errors:  0/256
 addr errors: 0/8192
 data errors: 262144/524288
Memtest KO
Memory initialization failed

--============= Console ================--
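To put a number on "about half the data", the memtest summary lines can be turned into fractions. The following is a minimal sketch that only parses the `X errors: a/b` lines copied from the log above; the function name `error_ratios` is made up for illustration and is not part of LiteX.

```python
import re

# Excerpt copied from the memtest summary in the log above.
MEMTEST_SUMMARY = """\
 bus errors:  0/256
 addr errors: 0/8192
 data errors: 262144/524288
"""

def error_ratios(log_text):
    """Return {kind: (errors, total, fraction)} for each 'X errors: a/b' line."""
    ratios = {}
    for kind, errors, total in re.findall(r"(\w+) errors:\s*(\d+)/(\d+)", log_text):
        errors, total = int(errors), int(total)
        ratios[kind] = (errors, total, errors / total)
    return ratios

ratios = error_ratios(MEMTEST_SUMMARY)
for kind, (errors, total, fraction) in ratios.items():
    print(f"{kind:5s} {errors:>7d}/{total:<7d} {fraction:.0%}")
```

For this log the data error fraction is exactly one half, while the bus and address checks report no errors.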

Jurisu25 commented Oct 6, 2023

Hello, I happen to have this board as well. Can you tell me the commands you used and your configuration parameters? The more detail the better; I may be able to answer your question.

jersey99 commented Oct 6, 2023

Thanks @sususjysjy. I have a 4GB Micron DDR4 module installed. My sys_clk_freq is 125MHz; I may need to try a different sys_clk_freq or adjust the speedgrade timings accordingly, but for now I just use the spd.dump. How did you instantiate the core? My pin definitions, module class, and PHY/SDRAM instantiation are below.

    ("ddram", 0,
     Subsignal("a", Pins(
         "BD40 BB35 BE40 BD34 BF40 BC39 BC34 BD39",
         "BD35 BE35 BA33 BF39 BD36 AV34"),  # AW33 AY33 AW36"),
         IOStandard("SSTL12_DCI")),
     Subsignal("ba", Pins("BA35 AY36"), IOStandard("SSTL12_DCI")),
     Subsignal("bg", Pins("BE36 BF37"), IOStandard("SSTL12_DCI")),
     Subsignal("we_n", Pins("AW33"), IOStandard("SSTL12_DCI")),   # A14
     Subsignal("cas_n", Pins("AY33"), IOStandard("SSTL12_DCI")),  # A15
     Subsignal("ras_n", Pins("AW36"), IOStandard("SSTL12_DCI")),  # A16
     Subsignal("act_n", Pins("BB38"), IOStandard("SSTL12_DCI")),
     # Subsignal("alert_n", Pins("BE38"), IOStandard("SSTL12_DCI")),
     Subsignal("cs_n", Pins("AY35 AV33"), IOStandard("SSTL12_DCI")), #  AW34 AU34
     #Subsignal("par", Pins("BF35"), IOStandard("SSTL12_DCI")),
     Subsignal("reset_n", Pins("BC38"), IOStandard("LVCMOS12")),
     Subsignal("cke", Pins("BE37 BF38"), IOStandard("SSTL12_DCI")),
     Subsignal("clk_p", Pins("BB36 BB37"), IOStandard("DIFF_SSTL12_DCI")),
     Subsignal("clk_n", Pins("BC36 BC37"), IOStandard("DIFF_SSTL12_DCI")),
     # Subsignal("cke", Pins("BE37"), IOStandard("SSTL12_DCI")),
     # Subsignal("clk_p", Pins("BB36"), IOStandard("DIFF_SSTL12")),
     # Subsignal("clk_n", Pins("BC36"), IOStandard("DIFF_SSTL12")),
     Subsignal("odt", Pins("AW35 AT34"), IOStandard("SSTL12_DCI")),
     Subsignal("dm", Pins("AH34 AJ27 AA32 AE31 BC31 AW29 BF32 AP31"), #  AT33
               IOStandard("POD12_DCI")),
     Subsignal("dq", Pins(
         "AF33 AG34 AH33 AJ33 AF34 AF32 AG32 AG31",
         "AK31 AG30 AJ29 AK28 AJ31 AJ30 AJ28 AG29",
         "Y33 W33 W30 AA34 Y32 W34 Y30 AB34",
         "AD34 AF30 AD33 AC32 AE30 AE33 AC34 AC33",
         "AY32 BA30 BB29 BB30 AY30 AY31 BA29 BB31",
         "AV31 AW31 AU30 AT29 AU32 AV32 AU31 AT30",
         "BD33 BE31 BD29 BF30 BE32 BE33 BC29 BE30",
         "AN29 AP29 AN31 AL30 AR30 AP30 AL29 AM31"),
         #"AM34 AP34 AM32 AP33 AL34 AN34 AL32 AR33"),
        IOStandard("POD12_DCI")),
     Subsignal("dqs_p", Pins("AH31 AH28 W31 AC31 BA32 AU29 BD30 AM29"), # AN32
               IOStandard("DIFF_POD12")),
     Subsignal("dqs_n", Pins("AH32 AH29 Y31 AD31 BB32 AV29 BD31 AM30"), # AN33
               IOStandard("DIFF_POD12")),
     Misc("SLEW=FAST"),
     ),

class MTA8ATF51264HZ(DDR4Module):
    # geometry
    ngroupbanks = 4
    ngroups     = 4
    nbanks      = ngroups * ngroupbanks
    nrows       = 32768
    ncols       = 1024
    # timings
    trefi = {"1x": 64e6/8192,   "2x": (64e6/8192)/2, "4x": (64e6/8192)/4}
    trfc  = {"1x": (None, 350), "2x": (None, 260),   "4x": (None, 160)}
    technology_timings = _TechnologyTimings(tREFI=trefi, tWTR=(4, 7.5), tCCD=(4, None), tRRD=(4, 4.9), tZQCS=(128, 80))
    speedgrade_timings = {
        "2133": _SpeedgradeTimings(tRP=15, tRCD=15, tWR=15, tRFC=trfc, tFAW=(20, 25), tRAS=33),
    }
    speedgrade_timings["default"] = speedgrade_timings["2133"]

        # sys_clk_freq = 125e6
        if not self.integrated_main_ram_size:
            self.ddrphy = usddrphy.USPDDRPHY(platform.request("ddram"),
                memtype          = "DDR4",
                sys_clk_freq     = sys_clk_freq,
                iodelay_clk_freq = sys_clk_freq * 4)
            if spd_dump is not None:
                ram_spd = parse_spd_hexdump(spd_dump)
                ram_module = SDRAMModule.from_spd_data(ram_spd, sys_clk_freq)
                print(f"configuring DDR4 from file: {spd_dump}")
            else:
                ram_module = MTA8ATF51264HZ(sys_clk_freq, "1:4")
            self.add_sdram("sdram",
                phy           = self.ddrphy,
                module        = ram_module,
                size          = 0x40000000,
                l2_cache_size = kwargs.get("l2_size", 8192)
            )
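As a sanity check on the geometry in the `MTA8ATF51264HZ` class, the capacity it implies can be worked out by hand. This is a rough sketch, not LiteDRAM's own calculation: it assumes capacity is simply banks × rows × columns × bus width × ranks, takes the 64-bit bus width from the BIOS banner, and introduces `nranks` and `capacity_bytes` purely for illustration.

```python
# Geometry copied from the MTA8ATF51264HZ class above.
ngroupbanks = 4
ngroups = 4
nbanks = ngroups * ngroupbanks
nrows = 32768
ncols = 1024

# Assumptions (not taken from the class): 64-bit data bus as printed by the
# BIOS banner, and a hypothetical rank count.
databits = 64

def capacity_bytes(nbanks, nrows, ncols, databits, nranks=1):
    """Addressable locations times bytes per location, times number of ranks."""
    return nbanks * nrows * ncols * (databits // 8) * nranks

one_rank = capacity_bytes(nbanks, nrows, ncols, databits, nranks=1)
two_ranks = capacity_bytes(nbanks, nrows, ncols, databits, nranks=2)
print(f"1 rank:  {one_rank / 2**30:.1f} GiB")
print(f"2 ranks: {two_ranks / 2**30:.1f} GiB")
```

With one rank this geometry gives 4 GiB, which matches the 4GB module described above. With two ranks the same formula gives 8 GiB, the figure shown in the first BIOS banner; whether the rank count is what actually produces that banner is not confirmed in this thread.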

Jurisu25 commented Oct 7, 2023

I have solved this problem: you need to change the module to MT40A512M16. Here is my boot log with that module:

--=============== SoC ==================--
CPU:            VexRiscv SMP-LINUX @ 125MHz
BUS:            WISHBONE 32-bit @ 4GiB
CSR:            32-bit data
ROM:            64.0KiB
SRAM:           6.0KiB
L2:             2.0KiB
SDRAM:          4.0GiB 64-bit @ 1000MT/s (CL-9 CWL-9)
MAIN-RAM:       1.0GiB

--========== Initialization ============--
Initializing SDRAM @0x40000000...
Switching SDRAM to software control.
Write leveling:
  tCK equivalent taps: 420
  Cmd/Clk scan (0-210)
  |0111  |011111111  |100111011  |001111111| best: 74
  Setting Cmd/Clk delay to 74 taps.
  Data scan:
  m0: |11110000000000000111111111| delay: 271
  m1: |11111000000000000011111111| delay: 287
  m2: |11111100000000000001111111| delay: 300
  m3: |11111110000000000000111111| delay: 00
  m4: |11111111100000000000001111| delay: 00
  m5: |11111111110000000000000111| delay: 00
  m6: |11111111111100000000000001| delay: 00
  m7: |11111111111100000000000001| delay: 00
Write latency calibration:
m0:0 m1:0 m2:0 m3:6 m4:6 m5:6 m6:6 m7:6
Read leveling:
  m0, b00: |00000000000000000000000000000000| delays: -
  m0, b01: |00000000000000000000000000000000| delays: -
  m0, b02: |00000000000000000000000000000000| delays: -
  m0, b03: |11111110000000000000000000000000| delays: 48+-48
  m0, b04: |00000000011111111111000000000000| delays: 223+-80
  m0, b05: |00000000000000000000000111111111| delays: 435+-73
  m0, b06: |00000000000000000000000000000000| delays: -
  m0, b07: |00000000000000000000000000000000| delays: -
  best: m0, b04 delays: 222+-79
  m1, b00: |00000000000000000000000000000000| delays: -
  m1, b01: |00000000000000000000000000000000| delays: -
  m1, b02: |00000000000000000000000000000000| delays: -
  m1, b03: |11111110000000000000000000000000| delays: 51+-51
  m1, b04: |00000000001111111111000000000000| delays: 233+-82
  m1, b05: |00000000000000000000000111111111| delays: 437+-73
  m1, b06: |00000000000000000000000000000000| delays: -
  m1, b07: |00000000000000000000000000000000| delays: -
  best: m1, b04 delays: 233+-82
  m2, b00: |00000000000000000000000000000000| delays: -
  m2, b01: |00000000000000000000000000000000| delays: -
  m2, b02: |00000000000000000000000000000000| delays: -
  m2, b03: |11110000000000000000000000000000| delays: 25+-25
  m2, b04: |00000001111111111000000000000000| delays: 184+-83
  m2, b05: |00000000000000000000111111111100| delays: 389+-81
  m2, b06: |00000000000000000000000000000000| delays: -
  m2, b07: |00000000000000000000000000000000| delays: -
  best: m2, b04 delays: 185+-84
  m3, b00: |00000000000000000000000000000000| delays: -
  m3, b01: |00000000000000000000000000000000| delays: -
  m3, b02: |00000000000000000000000000000000| delays: -
  m3, b03: |11111100000000000000000000000000| delays: 40+-40
  m3, b04: |00000000111111111110000000000000| delays: 210+-83
  m3, b05: |00000000000000000000011111111111| delays: 418+-84
  m3, b06: |00000000000000000000000000000000| delays: -
  m3, b07: |00000000000000000000000000000000| delays: -
  best: m3, b04 delays: 211+-83
  m4, b00: |00000000000000000000000000000000| delays: -
  m4, b01: |00000000000000000000000000000000| delays: -
  m4, b02: |00000000000000000000000000000000| delays: -
  m4, b03: |11000000000000000000000000000000| delays: 11+-11
  m4, b04: |00000111111111100000000000000000| delays: 153+-86
  m4, b05: |00000000000000000011111111110000| delays: 360+-82
  m4, b06: |00000000000000000000000000000001| delays: 501+-09
  m4, b07: |00000000000000000000000000000000| delays: -
  best: m4, b04 delays: 152+-84
  m5, b00: |00000000000000000000000000000000| delays: -
  m5, b01: |00000000000000000000000000000000| delays: -
  m5, b02: |00000000000000000000000000000000| delays: -
  m5, b03: |10000000000000000000000000000000| delays: 08+-08
  m5, b04: |00000111111111100000000000000000| delays: 152+-80
  m5, b05: |00000000000000000111111111110000| delays: 355+-81
  m5, b06: |00000000000000000000000000000001| delays: 503+-08
  m5, b07: |00000000000000000000000000000000| delays: -
  best: m5, b05 delays: 354+-83
  m6, b00: |00000000000000000000000000000000| delays: -
  m6, b01: |00000000000000000000000000000000| delays: -
  m6, b02: |00000000000000000000000000000000| delays: -
  m6, b03: |00000000000000000000000000000000| delays: -
  m6, b04: |00011111111111000000000000000000| delays: 132+-85
  m6, b05: |00000000000000000111111111100000| delays: 339+-82
  m6, b06: |00000000000000000000000000000011| delays: 491+-20
  m6, b07: |00000000000000000000000000000000| delays: -
  best: m6, b04 delays: 132+-86
  m7, b00: |00000000000000000000000000000000| delays: -
  m7, b01: |00000000000000000000000000000000| delays: -
  m7, b02: |00000000000000000000000000000000| delays: -
  m7, b03: |00000000000000000000000000000000| delays: -
  m7, b04: |00111111111110000000000000000000| delays: 118+-89
  m7, b05: |00000000000000001111111111000000| delays: 334+-81
  m7, b06: |00000000000000000000000000000111| delays: 486+-25
  m7, b07: |00000000000000000000000000000000| delays: -
  best: m7, b04 delays: 118+-88
Switching SDRAM to hardware control.
Memtest at 0x40000000 (2.0MiB)...
  Write: 0x40000000-0x40200000 2.0MiB
   Read: 0x40000000-0x40200000 2.0MiB
Memtest OK
Memspeed at 0x40000000 (Sequential, 2.0MiB)...
  Write speed: 93.7MiB/s
   Read speed: 78.1MiB/s

jersey99 commented Oct 7, 2023

Hi @sususjysjy, thanks a lot for your help. For some reason, I still get the same error with the MT40A512M16 instantiated: half the memory still fails. I think it has something to do with the A17 / BA / BG configuration. When I look at the datasheet for MT40A512M16, A17 seems to be used for some settings (and it clearly isn't connected on our board). Do you mind sharing your pin settings when you get a chance? Thanks in advance!

jersey99 commented Oct 9, 2023

@sususjysjy I managed to get it to work with MT40A512M16 by experimenting with the CS_N settings. I still don't quite understand why this works: MT40A512M16 is an 8GB part, and the memory I have on board is 4GB. There is some settings magic I still need to figure out. Another thing that stumps me is why this doesn't work with the SPD dump. I would love to discuss this further with anyone who has something to say here. Meanwhile, I will keep poking.
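One possible reading of the size puzzle, offered as a sketch rather than a confirmed explanation: if the part name follows the usual depth × width convention (`512M16` meaning 512 Meg locations of 16 bits), then the "8" is gigabits per chip, not gigabytes. The helper `chip_capacity_bits` is made up for illustration, and the assumption that a 64-bit bus is built from x16 chips says nothing about which chips are actually on the installed SODIMM.

```python
import re

def chip_capacity_bits(part):
    """Parse a trailing '<depth>M<width>' from a part name.

    Assumption: depth is in Meg (2**20) locations and width is in bits.
    Returns (capacity_in_bits, width_in_bits).
    """
    match = re.search(r"(\d+)M(\d+)$", part)
    if match is None:
        raise ValueError(f"no '<depth>M<width>' suffix in {part!r}")
    depth_mega, width = int(match.group(1)), int(match.group(2))
    return depth_mega * 2**20 * width, width

bits, width = chip_capacity_bits("MT40A512M16")
chips_for_64bit = 64 // width            # chips needed to fill a 64-bit bus
total_bytes = bits * chips_for_64bit // 8

print(f"per chip: {bits / 2**30:.0f} Gibit ({bits // 8 / 2**30:.0f} GiB)")
print(f"{chips_for_64bit} chips on a 64-bit bus: {total_bytes / 2**30:.0f} GiB")
```

Under these assumptions a 64-bit bus of such chips comes to 4 GiB, which is the size printed in the BIOS banner of the passing log.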
