Question : Can WDOG_DISABLE be toggled on the fly during system operation?
Answer: No. The WDOG_DISABLE status is latched only during boot, and the latched value enables or disables the watchdog timer
for the whole of phone operation. Toggling this pin after boot (on the fly) during
phone operation does not change the watchdog behavior.
Question : How can an OEM get log packet information directly from the source code?
Question:
Once a phone is connected to QXDM, a lot of information is visible, such as MM information or cell information. OEM customers often need to show the same information directly on the phone. How can an OEM get the related parameters directly from the source code?
Answer:
Each parameter shown in QXDM has a corresponding log packet, described in the ICD documents that Qualcomm provides to customers (see 80-V2708-1/80-V4083-1/80-V5295-1).
Each log packet also has a log code. For example, the log code for "UMTS NAS MM characteristics" is 0x7135.
In the Qualcomm source code, a unique macro matches each log code.
For example, log_codes.h defines the base values:
#define LOG_1X_BASE_C ((uint16) 0x1000)
#define LOG_WCDMA_BASE_C ((uint16) 0x4000)
#define LOG_GSM_BASE_C ((uint16) 0x5000)
#define LOG_LBS_BASE_C ((uint16) 0x6000)
#define LOG_UMTS_BASE_C ((uint16) 0x7000)
#define LOG_TDMA_BASE_C ((uint16) 0x8000)
#define LOG_DTV_BASE_C ((uint16) 0xA000)
#define LOG_APPS_BASE_C ((uint16) 0xB000)
#define LOG_DSP_BASE_C ((uint16) 0xC000)
#define LOG_TOOLS_BASE_C ((uint16) 0xF000)
Log code 0x7135 is then defined in log_codes_umts.h:
#define LOG_UMTS_MM_INFO_LOG_PACKET_C (0x135 + LOG_UMTS_BASE_C)
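The relationship between a subsystem base and a log code can be checked with a few lines of C. This is only a sketch: the helper names are hypothetical, and the assumption that the packet offset fits in the low 12 bits is an inference from the base values listed above, not something stated in the headers.

```c
#include <assert.h>
#include <stdint.h>

/* Base value copied from the log_codes.h excerpt above. */
#define LOG_UMTS_BASE_C ((uint16_t)0x7000)

/* Hypothetical helper: combine a subsystem base with a packet offset. */
static uint16_t log_code_from_offset(uint16_t base, uint16_t offset)
{
    /* Assumption: offsets stay within the low 12 bits below the base. */
    assert((offset & 0xF000u) == 0);
    return (uint16_t)(base + offset);
}

/* Hypothetical helper: recover the subsystem base from a full log code. */
static uint16_t log_code_base(uint16_t code)
{
    return (uint16_t)(code & 0xF000u);
}
```

For example, log_code_from_offset(LOG_UMTS_BASE_C, 0x135) gives 0x7135, the code QXDM shows for the MM information packet.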
In the Qualcomm implementation, two functions send a log packet to DIAG, after which the related parameters are shown in QXDM:
log_commit()
log_submit()
LOG_UMTS_MM_INFO_LOG_PACKET_C is sent to DIAG in mmsend_mm_info_log_packet().
Each log packet also has a corresponding structure. It does not look like a traditional structure definition; it is declared as:
LOG_RECORD_DEFINE(XXX)
...
LOG_RECORD_END
For example, the structure for LOG_UMTS_MM_INFO_LOG_PACKET_C is "LOG_UMTS_MM_INFO_LOG_PACKET_C_type", and its members are defined as below:
LOG_RECORD_DEFINE(LOG_UMTS_MM_INFO_LOG_PACKET_C)
/* Network Operation mode: Class I, Class II, or Class III */
uint8 network_operation_mode;
/* Available Service: Limited, CS, PS, CS and PS, No Service */
uint8 service_type;
/* Serving Cell PLMN */
log_plmn_id_type selected_plmn_id;
/* Serving Cell LAI */
log_location_area_id_type location_area_id;
/* Serving Cell RAC */
uint8 routing_area_code;
/* Number of available plmns */
uint8 num_available_plmns;
/* PLMN ids for available plmns */
log_plmn_id_type available_plmns[MAX_NUMBER_AVAILABLE_PLMNS];
LOG_RECORD_END
In summary, an OEM can read those parameters just before log_commit()/log_submit() is called. For example, to get the contents of log packet LOG_UMTS_MM_INFO_LOG_PACKET_C, modify mmsend_mm_info_log_packet().
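One possible shape for such a modification is to copy the fields of interest into an OEM-owned snapshot at the point where the record is filled in. The sketch below is an assumption, not Qualcomm code: every oem_* name is hypothetical, only three scalar fields of the record are mirrored, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical OEM-side snapshot of a few scalar fields of the MM info
 * record. The PLMN and LAI members are left out because their types are
 * not shown in this article. */
typedef struct {
    uint8_t network_operation_mode;
    uint8_t service_type;
    uint8_t routing_area_code;
    uint8_t valid;                  /* 0 until the first snapshot is stored */
} oem_mm_info_snapshot;

static oem_mm_info_snapshot oem_mm_info_cache;

/* Intended to be called from mmsend_mm_info_log_packet(), just before the
 * record is handed to log_commit()/log_submit(). */
static void oem_mm_info_store(uint8_t nom, uint8_t service, uint8_t rac)
{
    oem_mm_info_cache.network_operation_mode = nom;
    oem_mm_info_cache.service_type = service;
    oem_mm_info_cache.routing_area_code = rac;
    oem_mm_info_cache.valid = 1;
}

/* Reader side: returns 1 and fills *out if a snapshot exists, else 0. */
static int oem_mm_info_read(oem_mm_info_snapshot *out)
{
    assert(out != NULL);
    if (!oem_mm_info_cache.valid) {
        return 0;
    }
    *out = oem_mm_info_cache;
    return 1;
}
```

In a real build the store and read would run in different tasks, so the copy would need protection, for example a REX critical section.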
Question : How to debug a software watchdog (DOG) timeout issue
Answer: Use the procedure below to find the culprit.
Detail:
1) Symptom: The common symptoms of DOG failures are below.
Call stack: The DOG task shows the call stack below, or something similar.
blast_fatal_exit(asm)
err_fatal_handler()
dog_task(?)
rex_qube_task_entry_func_wrapper(?)
trampoline(?)
blast_trampoline(?)
end of frame
The coredump usually contains only the task name (in filename), with linenum 0x0:
coredump.err = (
version = 0x1,
linenum = 0x0,
timestamp = (0xA374382D, 0x00BA327F),
uptime = (0x0, 0x0),
filename = "CM",
message = "",
param = (0x0, 0x0, 0x0))
2) Basic understanding of the software watchdog: The dog task keeps its state in a dedicated structure
named dog_state_table[].
Dog monitors whether the other threads are running well. Each thread has its own timeout value and
must report to dog within that timeout.
If a thread fails to report, the dog task considers it unhealthy and raises an error fatal, which we call a dog timeout
failure.
In dog_state_table[], timeout is the task-specific timeout value, and count is decremented every
time dog runs.
count is reset to the timeout value every time the task reports to dog, so a count of 0x0 means the
task failed to report within the timeout period. On a dog timeout failure you will see count = 0x0 (in older versions of DOG) or err_cnt = 0x1 (in newer versions of DOG). In the snapshot below, the CM count has already been reset to its initial timeout value (0xC), but err_cnt is 0x1, which shows that CM failed to report to dog earlier.
dog_state_table = (
(err_cnt = 0x0, rpt_cnt = 0x062F, task = "startup_app", count = 0x0B, timeout = 0x0C, p_tcb = 0x02C8DB98 = end+0x51B98, is_blocked = 0x0, timeout
(err_cnt = 0x0, rpt_cnt = 0x062F, task = "DS MSGR REC", count = 0x0B, timeout = 0x0C, p_tcb = 0x02D3E0B8 = end+0x1020B8, is_blocked = 0x0, timeout
(err_cnt = 0x0, rpt_cnt = 0x0630, task = "1X_TX", count = 0x0B, timeout = 0x0C, p_tcb = 0x027F7854 = tx_tcb, is_blocked = 0x0, timeout_flag = 0xFF
(err_cnt = 0x0, rpt_cnt = 0x0630, task = "1X_RX", count = 0x0B, timeout = 0x0C, p_tcb = 0x027F6EE0 = rx_tcb, is_blocked = 0x0, timeout_flag = 0xFF
(err_cnt = 0x1, rpt_cnt = 0x0662, task = "CM", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F8590 = cm_tcb, is_blocked = 0x0, timeout_flag = 0xFF, d
(err_cnt = 0x0, rpt_cnt = 0x0627, task = "MMOC", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F380C = mmoc_tcb, is_blocked = 0x0, timeout_flag = 0xF
(err_cnt = 0x0, rpt_cnt = 0x00032818, task = "HDRMC", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F3F9C = hdrmc_tcb, is_blocked = 0x0, timeout_flag
(err_cnt = 0x0, rpt_cnt = 0x000244BF, task = "HDRTX", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F4364 = hdrtx_tcb, is_blocked = 0x0, timeout_flag
(err_cnt = 0x0, rpt_cnt = 0x6178, task = "HDRRX", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F39F0 = hdrrx_tcb, is_blocked = 0x0, timeout_flag = 0
(err_cnt = 0x0, rpt_cnt = 0x0001301F, task = "HDRDEC", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F7FE4 = hdrdec_tcb, is_blocked = 0x0, timeout_fl
(err_cnt = 0x0, rpt_cnt = 0x0DDF, task = "HDRBC", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F7C1C = hdrbc_tcb, is_blocked = 0x0, timeout_flag = 0
(err_cnt = 0x0, rpt_cnt = 0x2540, task = "DIAG", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F3260 = diag_tcb, is_blocked = 0x0, timeout_flag = 0xF
(err_cnt = 0x0, rpt_cnt = 0x0637, task = "DIAG_FWD_TA", count = 0x0C, timeout = 0x0C, p_tcb = 0x028FBF00 = diag_fwd_task_tcb, is_blocked = 0x0, ti
(err_cnt = 0x0, rpt_cnt = 0x0626, task = "UIM", count = 0x2, timeout = 0x0C, p_tcb = 0x027F94B0 = uim_tcb, is_blocked = 0x0, timeout_flag = 0xFF,
(err_cnt = 0x0, rpt_cnt = 0x20C8, task = "GSDI", count = 0x2, timeout = 0x0C, p_tcb = 0x027F5830 = gsdi_tcb, is_blocked = 0x0, timeout_flag = 0xFF
(err_cnt = 0x0, rpt_cnt = 0x08C1, task = "GSTK", count = 0x2, timeout = 0x0C, p_tcb = 0x027F5FC0 = gstk_tcb, is_blocked = 0x0, timeout_flag = 0xFF
(err_cnt = 0x0, rpt_cnt = 0x0941, task = "MM", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F5284 = mm_tcb, is_blocked = 0x0, timeout_flag = 0xFF, d
(err_cnt = 0x0, rpt_cnt = 0x0A3D, task = "REG", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F748C = reg_tcb, is_blocked = 0x0, timeout_flag = 0xFF,
(err_cnt = 0x0, rpt_cnt = 0x0630, task = "MN CNM MAIN", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F2E98 = mn_cnm_tcb, is_blocked = 0x0, timeout_f
(err_cnt = 0x0, rpt_cnt = 0x0630, task = "SM", count = 0x0C, timeout = 0x0C, p_tcb = 0x027F9C40 = sm_tcb, is_blocked = 0x0, timeout_flag = 0xFF, d
From (1) and (2) we can conclude that CM did not report to dog within its timeout, and that resulted in the dog failure.
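The count/timeout bookkeeping described in (2) can be modeled in a few lines. This is a simplified, hypothetical model for illustration only: the real dog_state_table[] has more fields (p_tcb, is_blocked, timeout_flag and others), and the real dog task raises an error fatal instead of returning an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical model of one dog_state_table[] entry. */
typedef struct {
    const char *task;
    uint32_t count;    /* monitoring passes left before the task is late */
    uint32_t timeout;  /* reload value, 0x0C for the tasks in the dump */
    uint32_t err_cnt;  /* set to 1 once the task has missed its deadline */
    uint32_t rpt_cnt;  /* number of reports received so far */
} dog_entry;

/* Task side: a report reloads count from timeout. */
static void dog_model_report(dog_entry *e)
{
    assert(e != NULL);
    e->count = e->timeout;
    e->rpt_cnt++;
}

/* Dog side: one monitoring pass over the table. Returns the index of the
 * first task whose count has reached zero, or -1 if all are healthy. */
static int dog_model_tick(dog_entry *table, size_t n)
{
    int late = -1;
    for (size_t i = 0; i < n; i++) {
        if (table[i].count > 0) {
            table[i].count--;
        }
        if (table[i].count == 0) {
            table[i].err_cnt = 1;
            if (late < 0) {
                late = (int)i;
            }
        }
    }
    return late;
}
```

A task that keeps calling dog_model_report() never reaches zero; a task that stops reporting is flagged after timeout passes, which corresponds to the err_cnt = 0x1 seen for CM in the dump.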
3) How to find the culprit: Now we need to figure out why CM failed to report to dog in this case.
There are three common reasons.
3.A) The specific task was busy doing something
A task should be designed to check the dog report timeout signal periodically; the signal and a timer are used for reporting to dog. In most tasks the dog report timeout signal is checked in the main loop. If the thread does something heavy in a subroutine, the task may fail to report to dog.
To check this possibility, look at the call stack of the failed thread. The common problem in this category is that the thread is in an infinite loop or has been looping for a long time. The call stack usually shows what the thread was doing at that moment. For call stack recovery, see KB#00021831 - complete procedure for stack recovery.
3.B) The specific task was waiting for something other than the dog report timeout signal
3.B.1) Check whether it was waiting for a critical section or mutex; the call stack shows this.
3.B.2) If it is waiting for REX signals, you will see rex_wait() in the call stack. Check the rex_tcb of the thread. The dog report timer sig is usually 0x1, so if the corresponding rex_tcb.wait does not have 0x1 set while rex_tcb.sigs does, the task was not waiting for the dog report timer sig when the sig arrived. If the thread is "CM", check cm_tcb.
3.B.3) If it is waiting for a critical section, you will see qmutex_lock() [in QuRT] or rex_enter_crit_sect() [in L4]. Then check who owns the critical section at that moment. On QuRT this is easy to check; refer to KB#00021723 - How to find mutex holder in BLAST environment for further information.
3.C) The specific thread did nothing wrong but was affected by starvation
3.C.1) In this case the thread is READY but not running. The thread that failed to report to dog is not the culprit: it received the dog report timer sig and was about to process it, but did not get a chance to run because of the activity of other, higher-priority tasks.
3.C.2) Check which other tasks were running at that moment. Look at the thread switch list to find out which threads prevented the failed thread from running, and check whether those threads' CPU consumption is reasonable.
3.C.3) If one specific thread held the CPU for a long time, check its activity by looking at its call stack. If the thread is unexpectedly in an infinite or long loop, it might be the culprit.
3.C.4) Even if you cannot find a loop in any single thread, the loop might span multiple threads. For example, thread A sends a signal to thread B, thread B sends a signal to thread C, and C sends one to A. No single thread contains a loop, but system-wise it is still a loop. You can check this in the thread switch list: if you see a repeating pattern of similar thread activity, suspect this.
3.C.5) If all those threads run as expected (from a technical point of view) but starvation is still observed, this is a performance issue. Simply put, the thread activity is more than the CPU can cover. We also call this a MIPS issue (million instructions per second, not a really accurate term), as the CPU MIPS is lower than what we need. This category of problem is hard to fix and usually requires optimization of the module.
3.C.6) In a QuRT environment there is one more concern. In QuRT a software thread can be assigned to one specific hardware thread only. In that case the software thread may fail to run because its associated hardware thread is too busy with other higher-priority tasks, even though other hardware threads are idle enough.
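The check in 3.B.2 is a simple mask comparison and can be written down as follows. This is a sketch under the assumption stated in the text that the dog report timer sig is 0x1; the enum and function names are hypothetical, and sigs and wait stand for the rex_tcb.sigs and rex_tcb.wait values read from the dump (for example from cm_tcb).

```c
#include <stdint.h>

/* Assumption taken from the text: the dog report timer sig is usually 0x1. */
#define DOG_RPT_TIMER_SIG 0x1u

typedef enum {
    DOG_SIG_NOT_PENDING,     /* the timer sig has not arrived */
    DOG_SIG_PENDING_WAITED,  /* arrived, and the task was waiting on it */
    DOG_SIG_PENDING_IGNORED  /* arrived, but the wait mask excluded it */
} dog_sig_state;

/* Classify a task from the sigs and wait masks of its rex_tcb. */
static dog_sig_state classify_dog_sig(uint32_t sigs, uint32_t wait)
{
    if ((sigs & DOG_RPT_TIMER_SIG) == 0) {
        return DOG_SIG_NOT_PENDING;
    }
    if ((wait & DOG_RPT_TIMER_SIG) != 0) {
        return DOG_SIG_PENDING_WAITED;
    }
    return DOG_SIG_PENDING_IGNORED;
}
```

DOG_SIG_PENDING_IGNORED corresponds to case 3.B: the task was blocked on something else. DOG_SIG_PENDING_WAITED with the thread READY but not running points toward the starvation case 3.C.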
Question : How to insert assembly instruction in L4/Rex BSP and driver code
Answer : The ARM C compiler supports inline assembly language through the __asm specifier. The __asm statement must be inside a C function. The syntax is:
__asm
{
instruction [; instruction]
...
...
[instruction]
}
An example code:
__asm
{
MOV r0,#0x0
MCR p15,0,r0,c7,c10,4 /* Data Synchronization Barrier */
}
Here are the rules that should be known before using this keyword:
1. If you include multiple instructions on the same line, you must separate them with a semicolon (;). If you use double quotes, you must enclose all the
instructions within a single set of double quotes.
2. If an instruction requires more than one line, you must specify the line continuation with the backslash character \.
3. For the multiple line format, you can use C or C++ comments anywhere in the inline assembly language block. However, you cannot embed comments
in a line that contains multiple instructions.
4. Comma is used as a separator in assembly language, so C expressions with the comma operator must be enclosed in parentheses to distinguish them:
__asm {ADD x, y, (f(), z)}
5. Register names in the inline assembler are treated as C variables. They do not necessarily relate to the physical register of the same name. If you do not
declare the register as a C variable, then the compiler warns you that it should be declared as a variable.
6. Do not save and restore registers in inline assembler. The compiler does this for you. Also, the inline assembler does not provide direct access to the
physical registers. If registers other than CPSR and SPSR are read without being written to, an error message is issued. For example:
int f(int x)
{
    __asm
    {
        STMFD sp!, {r0} // save r0 - illegal: read before write
        ADD r0, x, 1
        EOR x, r0, x
        LDMFD sp!, {r0} // restore r0 - not needed.
    }
    return x;
}
The function must be written as:
int f(int x)
{
    int r0;
    __asm
    {
        ADD r0, x, 1
        EOR x, r0, x
    }
    return x;
}
Question : What is the difference between "INTLOCK()" and "rex_enter_crit_sect()"?
Answer: The INTLOCK() API locks all interrupts and prevents thread switches. A REX critical section allows only one task at a time to execute
the protected section of code; it does not block other tasks that do not attempt to enter the same protected
section, and it does not block interrupts in general.
Details: We generally recommend using a REX critical section rather than INTLOCK(). Only one task is allowed to enter a
critical section at a time.
All other tasks that want to enter the same critical section have to wait until the
initial task exits it.
Among the waiting tasks, the one with the highest priority enters the critical section next.
So, when possible, we encourage the use of critical sections, since they do not block all tasks and interrupts
unnecessarily. Document 80-V1422-1 gives more details on the implementation of the REX API on the L4 kernel.
Question : How to change reset vectors
Answer : 1 What is a reset vector (in ARM)
When an exception is taken, processor execution is forced to an address that corresponds to the type of exception. These addresses are
called the exception vectors. By default, the exception vectors are eight consecutive word-aligned memory addresses (0x20 bytes in total),
starting at an exception base address. The exception base address is either 0x0000_0000 or 0xFFFF_0000, depending on the SCTLR.V bit (see footnote 1).
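The address of each vector follows directly from the base and the slot index. The sketch below uses the standard ARM vector order; the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Slot index of each exception in the ARM vector table (one word each). */
typedef enum {
    ARM_VEC_RESET    = 0,
    ARM_VEC_UNDEF    = 1,
    ARM_VEC_SWI      = 2,
    ARM_VEC_PABORT   = 3,
    ARM_VEC_DABORT   = 4,
    ARM_VEC_RESERVED = 5,
    ARM_VEC_IRQ      = 6,
    ARM_VEC_FIQ      = 7
} arm_vector;

/* sctlr_v is the SCTLR.V bit: 0 selects the low base, 1 the high base. */
static uint32_t arm_vector_address(int sctlr_v, arm_vector v)
{
    uint32_t base = sctlr_v ? 0xFFFF0000u : 0x00000000u;
    return base + 4u * (uint32_t)v;
}
```

With V set, arm_vector_address(1, ARM_VEC_DABORT) is 0xFFFF0010, the data abort vector on this target.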
2 How to change the reset vector
The reset vector can be changed by modifying the linker script (.SCL file). Taking osbl_nand.scl as an example, a new section can be added to
place the reset vector code at the beginning of IMEM.
OSBL_VEC_TBL OSBL_VECTOR_TABLE_BASE 0x20
{
osbl.o (OSBL_VECTOR_TABLE)
}
Make sure that the code has been copied to the
beginning of IMEM. See function osbl_create_vector_table() for details. Note that the real magic is copying the vectors to
the range IMEM_BASE to IMEM_BASE+0x20. Therefore it is fine to leave the linker script unchanged and copy the vectors there directly.
3 How does it work
PBL sets the initial reset vector at 0xFFFF_0000, the high vector address defined by the ARM processor. At this address:
ffff0000 <__main>:
ffff0000: ea000006 b ffff0020 <pbl_reset_handler>
ffff0004: ea00005b b ffff0178 <pbl_undefined_instruction_handler>
ffff0008: ea00005e b ffff0188 <pbl_swi_handler>
ffff000c: ea000061 b ffff0198 <pbl_prefetch_abort_handler>
ffff0010: ea000064 b ffff01a8 <pbl_data_abort_handler>
ffff0014: ea000067 b ffff01b8 <pbl_reserved>
ffff0018: ea00006a b ffff01c8 <pbl_irq_handler>
ffff001c: ea00006d b ffff01d8 <pbl_fiq_handler>
Let's take pbl_data_abort_handler at 0xFFFF_0010 as an example:
ffff01a8 <pbl_data_abort_handler>:
ffff01a8: e50d4004 str r4, [sp, #-4]
ffff01ac: e59f41bc ldr r4, [pc, #444] ; ffff0370 <Load$$PBL_USB_DEBUG_DATA$$Base+0xffff0ad0>
ffff01b0: e5944000 ldr r4, [r4]
ffff01b4: e12fff14 bx r4
Note that the instruction at 0xFFFF_01AC loads the word at 0xFFFF_0370, which is:
ffff0370: 20000010 .word 0x20000010
Note that 0x2000_0000 is the IMEM start address on the reference platform (QSC6695). If the PBL map file is available, we can check it instead:
unused_reset_vector 0x20000000 Data 0 pbl.o(IMEM_VECTOR_TABLE)
IMEM_VECTOR_TABLE 0x20000000 Section 32 pbl.o(IMEM_VECTOR_TABLE)
undefined_instruction_vector 0x20000004 Data 0 pbl.o(IMEM_VECTOR_TABLE)
swi_vector 0x20000008 Data 0 pbl.o(IMEM_VECTOR_TABLE)
prefetch_abort_vector 0x2000000c Data 0 pbl.o(IMEM_VECTOR_TABLE)
data_abort_vector 0x20000010 Data 0 pbl.o(IMEM_VECTOR_TABLE)
reserved_vector 0x20000014 Data 0 pbl.o(IMEM_VECTOR_TABLE)
irq_vector 0x20000018 Data 0 pbl.o(IMEM_VECTOR_TABLE)
fiq_vector 0x2000001c Data 0 pbl.o(IMEM_VECTOR_TABLE)
So when an exception happens, execution first jumps to the real exception vector, in the range 0xFFFF_0000 to 0xFFFF_0020 depending on the type of exception. That is PBL code, which then jumps to the matching entry in the range IMEM_BASE to IMEM_BASE+0x20, where we put our own reset vector code.
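The targets in the PBL disassembly can be reproduced from the raw opcodes. The sketch below decodes only the two ARM-state encodings that appear in the listing (B and a PC-relative LDR with an immediate offset); the function names are hypothetical.

```c
#include <stdint.h>

/* Target of an ARM-state B/BL instruction located at address pc. */
static uint32_t arm_branch_target(uint32_t pc, uint32_t insn)
{
    /* imm24 is a signed word offset relative to pc + 8. */
    int32_t imm24 = (int32_t)(insn & 0x00FFFFFFu);
    if (imm24 & 0x00800000) {
        imm24 -= 0x01000000;   /* sign-extend from 24 bits */
    }
    return pc + 8u + (uint32_t)(imm24 * 4);
}

/* Address read by "ldr rX, [pc, #imm12]" located at address pc. */
static uint32_t arm_ldr_literal_address(uint32_t pc, uint32_t insn)
{
    uint32_t imm12 = insn & 0xFFFu;
    /* Bit 23 (U) selects whether the offset is added or subtracted. */
    return (insn & (1u << 23)) ? pc + 8u + imm12 : pc + 8u - imm12;
}
```

Opcode ea000064 at 0xFFFF0010 decodes to a branch to 0xFFFF01A8 (pbl_data_abort_handler), and e59f41bc at 0xFFFF01AC reads the literal at 0xFFFF0370, which matches the listing.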
Footnotes:
1 The address is 0xFFFF_0000 because SCTLR.V is 1; otherwise the reset vector is at 0x0000_0000.
Question : Why does the Trace32 (Lauterbach) debugger GUI always hit a data abort in L4-based AMSS?
Answer: This is normal in L4-based software; the data abort is needed for setting up the MMU mapping.
Detail:
In L4-based AMSS software the data abort is normal: it causes the Iguana service to create the mapping between virtual address and physical address.
Progressive load also depends on it.
To avoid breaking on it every time in Trace32, disable the on-chip trigger in the Trace32 menu Break/On-chip trigger,
or add the following to your Trace32 config.t32:
TRONCHIP.SET IRQ OFF
TRONCHIP.SET DABORT OFF
TRONCHIP.SET PABORT OFF
TRONCHIP.SET SWI OFF
TRONCHIP.SET UNDEF OFF
TRONCHIP.SET RESET OFF
TRONCHIP.SET STEPVECTOR OFF
Generally, if you load via a cmm script, the script already disables the on-chip triggers.
Question : When a data abort message/error happens, how to find out the root cause?
Answer: Documents 80-VF749-1 and 80-VG736-1, together with the following procedure, will guide you in analyzing data abort issues.
Detail: You can find the error message in F3, the SMEM log, or the L4 message buffer. If you only know that it is a data abort, and F3/SMEM logs are not
available, you can get it from the L4 message buffer. To read the L4 message buffer, type "task.console" or
"d.dump jtag_console_buffer /nohex" in the Trace32 simulator. You need to load the RAM dump into the T32 simulator first.
In F3, the error log looks like:
err_exception_handler.c ExIPC: PFault 18f47000 @ 1838ce3e, tid=194001
Unhandled page fault:. addr=0x18f47000/??? priv=R. ip=0x1838ce3e, sp=b0041230. pd=0xb004f8b8 thread=0x194001.Thread 194001(SM_TM)
This means the data abort happened at instruction 0x1838CE3E. You can locate this instruction and find which function/source code is
related to it.
You can also use pf_dbg_msg[] and qxdm_dbg_msg[] for this kind of error information.
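When many logs have to be triaged, the faulting address, instruction pointer and thread id can be pulled out of the F3 line mechanically. The sketch below assumes the line has exactly the "PFault <addr> @ <ip>, tid=<tid>" shape of the sample above, with all three values in hexadecimal; the type and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned long fault_addr;  /* address whose access faulted */
    unsigned long ip;          /* instruction that caused the fault */
    unsigned long tid;         /* L4 thread id */
} pfault_info;

/* Returns 1 and fills *out if the line contains a PFault record, else 0. */
static int parse_pfault(const char *line, pfault_info *out)
{
    const char *p;

    if (line == NULL || out == NULL) {
        return 0;
    }
    p = strstr(line, "PFault ");
    if (p == NULL) {
        return 0;
    }
    return sscanf(p, "PFault %lx @ %lx, tid=%lx",
                  &out->fault_addr, &out->ip, &out->tid) == 3;
}
```

The ip value is the address to look up in the symbol information to find the function and source line involved.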
Question : How to restore ARM9 MMU manually on L4 target
Answer:
When a QPST memory dump is loaded, the script load_log.cmm tries to restore the MMU automatically by looking up the L4 kernel variable mmu_data in init.cc.
In rare cases the MMU cannot be restored correctly this way; we can then restore the MMU settings manually, which is also simple.
By checking mmu_data, we know that three registers must be restored: CR, TTBR and DACR.
1 Restore CR Register
The CR register is the SCTLR register in the ARM reference manual. From drivers/boot/cache_mmu.s, we know the default value of the CR register is
0x50078. Our ARM9 boots from 0xFFFF_0000, so the V bit must be ORed in, and we also enable icache/dcache and the MMU, so we can set
CR to 0x5317d.
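To see which bits separate the default value from the final one, the two constants can be compared bit by bit. The bit positions below follow the ARM9-class CP15 control register layout as I understand it and should be checked against the reference manual; the macro and function names are hypothetical. Note that the result also contains bit 8 (S), which the text above does not mention.

```c
#include <stdint.h>

/* Control register bit positions (assumed ARM9-class CP15 c1 layout). */
#define CR_M (1u << 0)    /* MMU enable */
#define CR_C (1u << 2)    /* data cache enable */
#define CR_S (1u << 8)    /* system protection */
#define CR_I (1u << 12)   /* instruction cache enable */
#define CR_V (1u << 13)   /* high exception vectors (0xFFFF_0000) */

/* Bits that are set in target_cr but not in default_cr. */
static uint32_t cr_extra_bits(uint32_t default_cr, uint32_t target_cr)
{
    return target_cr & ~default_cr;
}
```

cr_extra_bits(0x50078, 0x5317d) returns 0x3105, that is V, I, S, C and M.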
2 Restore TTBR Register
The TTBR register can be thought of as the page table address. The page table is _kernel_space_pagetable from the L4 kernel, also in init.cc.
We can view the address of _kernel_space_pagetable; for example, "v.v _kernel_space_pagetable" shows that its address is 0xF002_4000 on my target.
Note that the L4 kernel is relocated in head.spp, so the physical address is:
(&_kernel_space_pagetable - 0xF000_0000) + &__phys_addr_ram
Note that 0xF000_0000 is the L4 kernel base address (VIRT_ADDR_BASE).
From quartz_cfg.xml, we know that __phys_addr_ram is 0x100000.
<kernel file="build/pistachio/bin/kernel" xip="false" >
<patch address="__phys_addr_ram" value="0x100000" bytes="4"/>
<heap size="512K" />
<patch address="profiling_on" value="0x1" bytes="4"/>
</kernel>
So TTBR should be set to 0x124000.
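The TTBR value is a plain virtual-to-physical translation of the page table address. A minimal sketch, using the numbers from this target (the function and macro names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define L4_VIRT_ADDR_BASE 0xF0000000u   /* VIRT_ADDR_BASE from the text */

/* Translate an L4 kernel virtual address to its physical address, given
 * the __phys_addr_ram value patched in quartz_cfg.xml. */
static uint32_t l4_kernel_virt_to_phys(uint32_t virt, uint32_t phys_addr_ram)
{
    assert(virt >= L4_VIRT_ADDR_BASE);
    return (virt - L4_VIRT_ADDR_BASE) + phys_addr_ram;
}
```

With the page table at 0xF0024000 and __phys_addr_ram equal to 0x100000, the result is 0x124000.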
3 Restore DACR Register
The DACR register controls access permission to all memory. It allows complex memory access control, such as overlapping, …; for our case
we can just set DACR to 0x1 or 0x5555_5555; both will grant access to all memory.
4 Make it work
With the above steps we have set up the basic MMU registers. We can now type:
mmu.off
mmu.scan
mmu.on
in Trace32. After mmu.on, we should be able to see the MMU set up properly.
For more details about CR, TTBR and DACR, please see the ARM architecture reference manual (DDI0406B).
Question : How to extend/increase the L4 thread switch list
Answer : On 7x30, only 168 entries of the thread switch list are currently stored in the trace_buffer. Sometimes this is not enough to debug
cases where you want to know the switch list of the recent past.
To get more entries, we need to increase the trace buffer size.
This is done by adding the trace_buffer_size patch line shown below inside the kernel element:
<kernel file="build_M/pistachio/bin/kernel" xip="false" >
<patch address="__phys_addr_ram" value=SCL_MODEM_CODE_BASE bytes="4"/>
<patch address="trace_buffer_size" value="0x8000" bytes="4"/>
<patch address="profiling_on" value="1" bytes="4"/>
We can use either 0x8000 (32K) or 0x10000 (64K). With 0x10000 we get approximately 2728 entries.
Also, please make sure that the heap size is large enough.
To be on the safer side, please have the following line as well
Question : How to calculate the RAM memory requirement for the ARM9 image in MDM9K target
Answer : From the directory build\ms\bin\SCAQxxx which contains the images for all the processors:
Launch the command:
readelf -l amss.mbn
The result is something like:
Entry Point 0x100000 <-- defined in file build\ms\custscaqxxx.h "#define SCL_MODEM_CODE_BASE 0x0010000"
There are 12 program headers
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x001000 0x005b1000 0x005b1000 0x00118 0x01000 0 <-- The Highest Starting Addr
LOAD 0x008000 0xf0000000 0x00100000 0x1ca8c 0x222bc RWE 0x8000
LOAD 0x02c000 0xf0024000 0x00124000 0x06000 0x06000 RW 0x8000
LOAD 0x038000 0xb0000000 0x00190000 0x1539f 0x1539f R E 0x8000
LOAD 0x050000 0xb0060000 0x001b0000 0x00188 0x13004 RW 0x8000
LOAD 0x052000 0x001c4000 0x001c4000 0x13146c 0x3d5000 RWE 0x2000
LOAD 0x184000 0x00599000 0x00599000 0x02000 0x02000 R E 0x1000
LOAD 0x187000 0x0059b000 0x0059b000 0x00000 0x06000 RW 0x1000
LOAD 0x188000 0x005a1000 0x005a1000 0x00000 0x01000 RW 0x1000
LOAD 0x189000 0x005a2000 0x005a2000 0x02000 0x02000 RW 0x1000
LOAD 0x18c000 0x20000000 0x20000000 0x02000 0x02000 R E 0x1000
LOAD 0x18f000 0xb0028000 0x005a8000 0x09000 0x09000 RW 0x1000
Look for the segment with the highest start address, excluding the paged segment at 0x20000000. Here that is 0x005b1000; add its size (0x01000) to get 0x005b2000.
The entry point is stated first (Entry point 0x100000). Subtract the entry point from the result:
0x005b2000 – 0x100000 = 0x04b2000 = 4.7 MB
The sum of the segment sizes (sum of MemSiz), excluding the paged segment (address 0x20000000), shows how much padding there is in the image's map.
L4 is able to use these gaps for the LibC heap. In this example, the sum of the segments is 4.2 MB, which means that about 512 KB should be
available for the heap within the 4.7 MB of memory occupied by the image.
The heap requirement that must be added to this number is 1.5 MB. Also, the top boundary of the partition must be 512 KB aligned.
So the minimum partition size in this example would be:
4.7 (image size) – 0.5 (padding) + 1.5 (LibC heap) = 5.7 MB
With the alignment constraint, it must be rounded up to 6 MB.
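The same arithmetic can be done on exact byte counts instead of rounded megabytes. The sketch below is an illustration, not a Qualcomm tool: the struct and function names are hypothetical, the caller is expected to pass the LOAD segments without the paged segment at 0x20000000, and the 1.5 MB heap and 512 KB alignment are the figures from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PARTITION_ALIGN 0x80000u   /* 512 KB, from the text */

typedef struct {
    uint32_t phys_addr;   /* PhysAddr column of readelf -l */
    uint32_t mem_size;    /* MemSiz column of readelf -l */
} load_segment;

static uint32_t align_up(uint32_t value, uint32_t align)
{
    assert(align != 0 && (align & (align - 1u)) == 0);  /* power of two */
    return (value + align - 1u) & ~(align - 1u);
}

/* Memory occupied by the image: highest segment end minus the base. */
static uint32_t image_span(const load_segment *segs, size_t n, uint32_t base)
{
    uint32_t top = base;
    for (size_t i = 0; i < n; i++) {
        uint32_t end = segs[i].phys_addr + segs[i].mem_size;
        if (end > top) {
            top = end;
        }
    }
    return top - base;
}

/* Minimum partition size: segment sizes plus heap, with the top boundary
 * rounded up to the alignment. Padding is excluded because L4 can reuse
 * it for the heap. */
static uint32_t min_partition_size(const load_segment *segs, size_t n,
                                   uint32_t base, uint32_t heap_bytes)
{
    uint32_t used = 0;
    for (size_t i = 0; i < n; i++) {
        used += segs[i].mem_size;
    }
    return align_up(base + used + heap_bytes, PARTITION_ALIGN) - base;
}
```

For the eleven non-paged segments listed above, with base 0x100000 and a heap of 0x180000 bytes, image_span() gives 0x4B2000 and min_partition_size() gives 0x600000 (6 MB).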
Question : Where is PMEM allocated from?
Answer : PMEM is physical memory that is mapped 1:1 for DMA purposes.
It can be allocated from different pre-allocated regions, such as Modem heap, MM heap1, MM heap2 and unused SDRAM.
1. For the modem side in dual-processor or single-processor targets.
quartz_X.xml specifies where PMEM is allocated from, for example:
<memsection name="pmem" direct="true" virt_addr=SCL_MODEM_HEAP1_BASE size=SCL_MODEM_HEAP1_SIZE attach="rwx" cache_policy="uncachedbufferable" virtpool="virtual_pmem_heap" physpool="physical_pmem_heap" zero="false" />
This means pmem is allocated from MODEM HEAP1.
If nothing is specified in quartz_X.xml, pmem is allocated from unused SDRAM, that is, the unused memory managed by Iguana for its processes.
2. For the application side in dual-processor targets.
pmem is allocated from SMI or EBI1 according to the configuration in pmem_7k_init_ebi1() and pmem_7k_init_smi(). PMEM allocated in SMI can be used for higher performance and concurrent use cases.
Question: How to get the memory use (ro, rw, zi) of a Hexagon ELF
Answer: The Hexagon linker cannot produce a memory map in the format that the ARM linker generates with the --map option.
However, there is a way to find the sizes of code, data and zi.
First, use elfweaver.exe to print out the contents of the ELF file.
Run "elfweaver.exe print -a M9X00ASCAQMAZQ1013.elf >> M9X00ASCAQMAZQ1013.elf.txt"; its output looks like the following.
--------------------------------------------------------------------
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: <unknown>: a4
Version: 0x1
Entry point address: 0xf00000
Start of program headers: 52 (bytes into file)
Start of section headers: 146392096 (bytes into file)
Flags: 0x1
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 4
Size of section headers: 40 (bytes)
Number of section headers: 25
Section header string table index: 22
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .start PROGBITS 00f00000 001000 0002a0 00 AX 0 0 32
[ 2] .init PROGBITS 00f002a0 0012a0 000060 00 AX 0 0 32
[ 3] .text PROGBITS 00f00300 001300 fa73a0 00 AX 0 0 64
[ 4] .fini PROGBITS 01ea76a0 fa86a0 000040 00 AX 0 0 32
[ 5] .rodata PROGBITS 01ea8000 fa9000 42a60c 00 A 0 0 8
[ 6] .data PROGBITS 022d4000 13d4000 0aec70 00 WA 0 0 8
[ 7] .eh_frame PROGBITS 02382c80 1482c80 0d797c 00 WA 0 0 32
[ 8] .gcc_except_table PROGBITS 0245a5fc 155a5fc 003b1c 00 WA 0 0 4
[ 9] .ctors PROGBITS 0245e118 155e118 000018 00 WA 0 0 4
[10] .dtors PROGBITS 0245e130 155e130 00000c 00 WA 0 0 4
[11] .bss NOBITS 0245e180 155e180 100c180 00 WA 0 0 128
[12] .sdata PROGBITS 03480000 155f000 01b898 00 WAp 0 0 32
[13] .sbss NOBITS 0349b8c0 157a8c0 000080 00 WAp 0 0 8
[14] .debug_aranges PROGBITS 00000000 157a8c0 07df58 00 0 0 1
[15] .debug_pubnames PROGBITS 00000000 15f8818 1b0dc6 00 0 0 1
[16] .debug_info PROGBITS 00000000 17a95de 55e9628 00 0 0 1
[17] .debug_abbrev PROGBITS 00000000 6d92c06 1c5283 00 0 0 1
[18] .debug_line PROGBITS 00000000 6f57e89 149df6d 00 0 0 1
[19] .debug_frame PROGBITS 00000000 83f5df8 159b00 00 0 0 4
[20] .debug_str PROGBITS 00000000 854f8f8 592f57 01 MS 0 0 1
[21] .debug_ranges PROGBITS 00000000 8ae284f 0b9ae8 00 0 0 1
[22] .shstrtab STRTAB 00000000 8b9c337 0000e9 00 0 0 1
[23] .symtab SYMTAB 00000000 8b9c808 252f20 10 24 91199 4
[24] .strtab STRTAB 00000000 8def728 211d6e 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x001000 0x00f00000 0x00f00000 0xfa76e0 0xfa76e0 R E 0x1000
LOAD 0xfa9000 0x01ea8000 0x01ea8000 0x42a60c 0x42a60c R 0x1000
LOAD 0x13d4000 0x022d4000 0x022d4000 0x18a13c 0x1196300 RW 0x1000
LOAD 0x155f000 0x03480000 0x03480000 0x1b898 0x1b940 RW 0x1000
Section to Segment mapping:
Segment Sections...
00 .start .init .text .fini
01 .rodata
02 .data .eh_frame .gcc_except_table .ctors .dtors .bss
03 .sdata .sbss
There is no KIP in this file.
There is no Bootinfo section in this file.
--------------------------------------------------------------------
Under the ELF spec, an executable object file can include a Section Header Table and a Program Header Table, which describe sections and segments respectively.
The TYPE attribute in each section header table entry is PROGBITS or NOBITS (for zero-init areas),
and the SIZE attribute gives the size of each section.
Here are some common section names.
.text - text section
.init - initialization text section for shared libraries
.fini - cleanup text section
.rodata - read-only data section
.data - large data section
.sdata - small data section
.bss - large bss section (bss means zero init area)
.sbss - small bss section
From each program header table entry we can read the virtual address, physical address and size of the segment.
FileSiz is the space the segment occupies in the ELF file, and MemSiz is the space it occupies in memory.
If a segment contains a zi section as well as initialized sections, FileSiz and MemSiz differ, as shown in segment 02 and segment 03 in the above example.
Let's find out how much memory is used for ro, rw, and zi in the above ELF output.
RO (code+rodata) : 0xfa76e0 (MemSiz in segment 00) + 0x42a60c (MemSiz in segment 01) = 0x13d1cec
RW (rw data) : 0x18a13c (FileSiz in segment 02) + 0x1b898 (FileSiz in segment 03) = 0x1a59d4
ZI (zero init data) : 0x1196300 - 0x18a13c (MemSiz-FileSiz of seg 02) + 0x1b940 - 0x1b898 (MemSiz-FileSiz of seg03) = 0x100c26c
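The three sums above follow one rule per segment, so they are easy to script. A minimal sketch, assuming the program header values have already been read into an array (the struct and function names are hypothetical; PF_W is the standard ELF "writable" segment flag):

```c
#include <stddef.h>
#include <stdint.h>

#define PF_W 0x2u   /* ELF program header flag: segment is writable */

typedef struct {
    uint32_t file_size;   /* FileSiz */
    uint32_t mem_size;    /* MemSiz */
    uint32_t flags;       /* Flg, as ELF PF_* bits */
} elf_segment;

typedef struct {
    uint32_t ro;   /* code + read-only data */
    uint32_t rw;   /* initialized writable data */
    uint32_t zi;   /* zero-initialized data */
} mem_usage;

static mem_usage summarize_segments(const elf_segment *segs, size_t n)
{
    mem_usage u = { 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (segs[i].flags & PF_W) {
            /* Writable: bytes present in the file are RW, the rest is ZI. */
            u.rw += segs[i].file_size;
            u.zi += segs[i].mem_size - segs[i].file_size;
        } else {
            u.ro += segs[i].mem_size;
        }
    }
    return u;
}
```

Fed with the four LOAD segments of the example, this reproduces RO = 0x13d1cec, RW = 0x1a59d4 and ZI = 0x100c26c.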
Question: Where are the L4/Iguana kernel internal heaps defined and placed?
Answer: The L4/Iguana kernel uses a small amount of memory internally. The kernel heaps themselves are usually defined in the
quartz_cfg_<buildid>.xml and quartz_cfg_machine_<buildid>.xml (7k targets only) files. On 7k targets, the AMSS heap is also defined in the
same files.
Details: The total physical memory available on the device is specified to L4/Iguana in the same file as follows:
<physical_memory base="0x00100000" size="0x07f00000" />
Elfweaver usually places the kernel code and internal heaps at the beginning of this address range. Once this is done, it specifies the last address
location via the cust_l4_scl_<varies by build>.h file. This is where the AMSS image starts. You can get the end location of the AMSS image from
the map file. The AMSS heap is specified in the quartz_cfg_<xxxx>.xml file as follows:
<program name="AMSS" direct="true" priority="100" stack="0x8000" heap="0x10000" server_name="300" filename=AMSS_RELOC_LC >
This heap is usually placed at the end of AMSS, so the heap location usually starts at the end of the AMSS image. In newer builds, the entire physical
map is output in the build logs to make the placement of the various heaps/sections easy to understand.
Question: How to check how much memory is being used
Question: Is there a function to check how much memory is being used in graphics driver?
Answer: The QCOM_memory_monitor extension is provided to collect information about the current memory status in the graphics driver. GetMemoryStatsQCOM returns
how much memory is available and how much memory is being used for each purpose.
___________________________________________________________________________________________________
Function:
void GetMemoryStatsQCOM(enum pname, enum usage, int64* param)
Parameters:
<pname> can be one of the following.
GL_VMEM_TOTAL_AVAILABLE_QCOM
: Returns the total graphics memory currently available. The <usage> parameter should be GL_ALL_USAGE_QCOM in this case.
GL_VMEM_USED_THIS_PROCESS_QCOM
: If <usage> is GL_ALL_USAGE_QCOM, returns how much graphics memory is being used by this process. If <usage> is not GL_ALL_USAGE_QCOM, returns how much graphics memory this process is using for the <usage> purpose.
GL_VMEM_USED_ALL_PROCESSES_QCOM
: If <usage> is GL_ALL_USAGE_QCOM, returns how much graphics memory is being used by the whole system. If <usage> is not GL_ALL_USAGE_QCOM, returns how much graphics memory the whole system is using for the <usage> purpose.
GL_VMEM_LARGEST_CONTIGUOUS_BLOCK_REMAINING_QCOM
: Returns the size of the largest contiguous block. An attempt to allocate more memory than this would fail. <usage> is ignored.
GL_HEAPMEM_TOTAL_AVAILABLE_QCOM
: Returns the total heap memory currently available. The <usage> parameter should be GL_ALL_USAGE_QCOM in this case.
GL_HEAPMEM_USED_THIS_PROCESS_QCOM
: If <usage> is GL_ALL_USAGE_QCOM, returns how much heap memory is being used by this process. If <usage> is not GL_ALL_USAGE_QCOM, returns how much heap memory this process is using for the <usage> purpose.
<usage> can be one of the following.
GL_EGL_USAGE_QCOM
: Specifies memory being used by EGL, e.g. EGL surfaces, EGL images.
GL_TEXTURE
: Memory being used by textures, e.g. glTexImage2D.
GL_FRAMEBUFFER
: Memory being used by frame buffer objects.
GL_RENDERBUFFER
: Memory being used by render buffer objects, e.g. glRenderbufferStorage.
GL_ARRAY_BUFFER
: Memory being used by vertex buffer objects. It can also be used for some internal vertex structures, e.g. glBufferData.
GL_ELEMENT_ARRAY_BUFFER
: Memory being used by vertex buffer objects. It can also be used for some internal vertex structures, e.g. glBufferData.
GL_CLIENT_VERTEX_ARRAY_QCOM
: Memory being used by internal vertex-related structures.
GL_OTHER_GL_USAGE_QCOM
: OpenGL ES related internal structures, e.g. binning structures, command buffer, context structures.
GL_2D_USAGE_QCOM
: Memory being used for 2D graphics work.
GL_OTHER_USAGE_QCOM
: Memory being used for purposes other than the above.
GL_ALL_USAGE_QCOM
: Indicates the sum of all of the above.
Examples:
// Get total graphics memory available.
glGetMemoryStatsQCOM(GL_VMEM_TOTAL_AVAILABLE_QCOM, GL_ALL_USAGE_QCOM, &avail);
// Get how much graphics memory is being used for textures.
glGetMemoryStatsQCOM(GL_VMEM_USED_ALL_PROCESSES_QCOM, GL_TEXTURE, &texture);
// Get how much graphics memory is being used in total.
glGetMemoryStatsQCOM(GL_VMEM_USED_ALL_PROCESSES_QCOM, GL_ALL_USAGE_QCOM, &used);
Question: How do I reserve memory for my own use in L4 based systems?
Answer: A new memory section has to be created with the required cache attributes and reserved for use as described below.
Details: The solution provided below is a specific solution for when a non-cacheable memory section is required. However, the solution can be
extended to other purposes, such as reserving memory for sharing data between boot and AMSS, or for DMA.
In this example, we show the changes required on the newer kernel to reserve 1MB of memory on the apps processor.
This is the most complex case. For other targets with older kernels, slight alterations may be required, which should be obvious from
looking at the quartz_cfg files.
The first step is to alter the memory map as required to make room for the special section.
In the targ<xxxxxx>a.h and targ<xxxxxxx>m.h files, change your memory map to reserve space for your new area.
#define SCL_APPS_CODE_BASE 0x10000000
#define SCL_APPS_CODE_SIZE 0x2000000
#define SCL_APPS_RAM_BASE 0x12000000
#define SCL_APPS_RAM_SIZE 0x3300000 /*0x3400000*/
#define SCL_APPS_AMSS_TOTAL_SIZE 0x5300000 /*0x5400000*/
#define CUST_DMOV_BUF_BASE 0x15300000
#define CUST_DMOV_BUF_SIZE 0x100000
Here we are reserving 1MB of memory from 0x15300000 to 0x15400000 for our use. In the quartz_cfg_machine.xml file, define new memory regions for your use:
<snip>
#ifdef IMAGE_APPS_PROC
#ifndef FEATURE_PMEM_7K_L4_HEAP
<!-- multimedia heaps -->
<virtual_memory name="virtual_mm_heap" >
#ifdef SCL_MM_HEAP1_BASE
<region base=SCL_MM_HEAP1_BASE size=SCL_MM_HEAP1_SIZE />
#endif
<region base=SCL_MM_HEAP2_BASE size=SCL_MM_HEAP2_SIZE />
</virtual_memory>
#endif
<!-- CUST_DMOV_BUF_BASE -->
<virtual_memory name="virtual_cust_dmov_heap" >
<region base=CUST_DMOV_BUF_BASE size=CUST_DMOV_BUF_SIZE />
</virtual_memory>
#else
<snip>
<snip>
#ifdef IMAGE_APPS_PROC
#ifndef FEATURE_PMEM_7K_L4_HEAP
<!-- multimedia heaps -->
<physical_memory name="physical_mm_heap" >
#ifdef SCL_MM_HEAP1_BASE
<region base=SCL_MM_HEAP1_BASE size=SCL_MM_HEAP1_SIZE />
#endif
<region base=SCL_MM_HEAP2_BASE size=SCL_MM_HEAP2_SIZE />
</physical_memory>
#endif
<!-- CUST_DMOV_BUF_BASE -->
<physical_memory name="physical_cust_dmov_heap" >
<region base=CUST_DMOV_BUF_BASE size=CUST_DMOV_BUF_SIZE />
</physical_memory>
#else
<snip>
<snip>
#ifdef IMAGE_APPS_PROC
#ifndef FEATURE_PMEM_7K_L4_HEAP
<virtual_pool name="virtual_mm_heap">
<memory src="virtual_mm_heap" />
</virtual_pool>
<physical_pool name="physical_mm_heap" direct="true" >
<memory src="physical_mm_heap" />
</physical_pool>
#endif
<virtual_pool name="virtual_cust_dmov_heap">
<memory src="virtual_cust_dmov_heap" />
</virtual_pool>
<physical_pool name="physical_cust_dmov_heap" direct="true" >
<memory src="physical_cust_dmov_heap" />
</physical_pool>
#else
<snip>
In the quartz_cfg.xml file, make the required changes to ensure that the area is reserved for you:
<snip>
#ifdef IMAGE_APPS_PROC
#ifndef FEATURE_PMEM_7K_L4_HEAP
<!-- Multimedia heaps -->
#ifdef SCL_MM_HEAP1_BASE
<memsection name="mm_heap1" direct="true" virt_addr=SCL_MM_HEAP1_BASE size=SCL_MM_HEAP1_SIZE attach="rwx" cache_policy="intnc_exwb" virtpool="virtual_mm_heap" physpool="physical_mm_heap" zero="false" />
#endif
<memsection name="mm_heap2" direct="true" virt_addr=SCL_MM_HEAP2_BASE size=SCL_MM_HEAP2_SIZE attach="rwx" cache_policy="intnc_exwb" virtpool="virtual_mm_heap" physpool="physical_mm_heap" zero="false" />
#endif
<memsection name="cust_dmov" direct="true" virt_addr=CUST_DMOV_BUF_BASE size=CUST_DMOV_BUF_SIZE attach="rwx" cache_policy="uncachedbufferable" virtpool="virtual_cust_dmov_heap" physpool="physical_cust_dmov_heap" zero="false" />
#else /* IMAGE_APPS_PROC */
<snip>
Finally, rename your quartz_cfg_machine_<xxxxxx>A.xml, quartz_cfg_machine_<xxxxxxx>M.xml, quartz_cfg_<xxxxxxx>A.xml and quartz_cfg_<xxxxxx>M.xml and ensure that new files are created with the proper updates.
Question: Question on AMSS heap implementations
Answer: There are many small heaps in the system. Most of them are restricted for use by the individual modules that own them. However,
the main heaps are the static heap that q_malloc uses before tmc_init is performed and the BREW/CS heap used later.
The Pmem heap is a special heap reserved with 1:1 mapping; it is uncached and is mainly used by drivers and multimedia. Please refer to pmem_62xx.c
for details. Depending on the particular release, this heap is either allocated dynamically from the free RAM left after all static allocations
or allocated statically.
Question: On 6k targets, how many of them belong to BREW heap ?
Answer: Brew/CS uses the k1kheap whose size is defined as SCL_CS_KHEAP_SIZE.
Question: On 6k targets, how many of them belong to the generic heap type?
Answer: The BREW/CS heap is the most commonly used generic heap.
Question: On 6k targets, is there other heap management mechanism except the above two ?
Answer: While BREW/CS has its own heap management mechanism, memheap.c is another mechanism that is used by AMSS heaps for heap management.
q_malloc has its own mechanism in qmalloc.c
Question: Can we modify the size of BREW heaps ? If so, how ?
Answer: SCL_CS_KHEAP_SIZE can be changed as appropriate.
Question: Can we add new generic heaps ? If so, how ?
Answer: Yes. You can add new heaps as necessary. Use a static array to define a heap buffer, or allocate a buffer from an existing heap.
Then you can use the functions in memheap.c to initialize and manage the heap.
Question: Can we modify the size of generic heaps ? If so, how ?
Answer: Most heaps are defined as static arrays, and their sizes can be changed as necessary.
Question: How to reconstruct call stack?
Answer: There are multiple ways to check the call stack; the details are below.
Detail:
This is the complete procedure for stack recovery, which I want to share. I think most of the techniques here are applicable to HLOS as well.
1) Using T32 functionality
As all of you know, T32 can restore the call stack automatically if the OS awareness module is installed.
This is the easiest way of doing it.
1.a) Install OS awareness. HLOS may have a different procedure, but for L4 and QuRT you can do this as shown below.
For QuRT(Blast)
TASK.CONFIG modem_proc\core\kernel\blast\install\debugger\Blast_model.t32
MENU.ReProgram modem_proc\core\kernel\blast\install\debugger\blast_model.men
For L4
TASK.CONFIG build\ms\l4.t32
MENU.ReProgram build\ms\l4.men
Please note that the location of those files might vary by target.
1.b) Type var.frame (v.f) and select the task name; you can then see the call stack
-000|okl4_sys_notify_wait(asm)
-001|rex_wait(p_sigs = 4294967217)
-002|mcc_wait(requested_mask = 327681)
-003|mccidl_start_sleep(new_state = 515)
-004|mccidl_state(new_state = 515)
-005|mccidl_online(new_state = 515)
-006|cdma_idle()
-007|mcc_subtask(?)
-008|mc_online_digital_state(mmoc_trans_id_ptr = 0x4452A3F8)
-009|mc_cdma_prot_activate(act_reason = PROT_ACT_ONLINE_ACQ)
-010|mc_process_cmd()
-011|mc_task(?)
-012|rex_thread_init(entrypoint = 0x436D8371)
-013|UR:0x43C19AC8(asm)
-014|thread_delete_self()
---|end of frame
2) Using Stack Tracer.
However, the above procedure may not work if SP (Stack Pointer)/PC (Program Counter) are not known.
This usually happens when the thread is in the running state, as register values are saved only when a thread is scheduled out.
We can use Stack Tracer to print values that look like function calls.
2.a) However, before using this, you need to find out the stack range.
2.a.1) You can find the range with symbol.browse (y.b)
symbol___________________|type_____________________|address_________________|
mc_SrchStrategyTrigger | | P:43053018--430530BB
mc_SrchStrategyUpdateDpo.| | P:43053CC4--43053D49
mc_stack |(rex_stack_word_type [20.| D:445287C0--4452A7BF
mc_state |(byte ()) | P:436D8416--436D8445
mc_store_watermark |(void ()) | P:436D8552--436D8553
The assumption here is that we know the name of the variable that holds the task's stack.
2.a.2) Otherwise, if you know SP roughly, you can search for the symbol name with symbol.list (y.l).
In this case, if we already know SP is around 0x4452A278, you can run
y.l 0x4452A278
and you will see that mc_stack[] is the name.
___address___to_________|path\symbol_____________________________|type_____________________|scope_|location|
D:445287C0--4452A7BF|\\M8660AAABQNSZM3161B\Global\mc_stack |(rex_stack_word_type [20.|global|static |
D:4452A7C0--4452A9BB|\\M8660AAABQNSZM3161B\Global\dh_tcb |(rex_tcb_type) |global|static |
D:4452A9BC--4452B9BB|\\M8660AAABQNSZM3161B\Global\dh_stack |(rex_stack_word_type [10.|global|static |
For ARM,
Do staxtract5a.cmm MEMR <stack top> <stack bottom>
For QuRT,
Do staxtract5b_q6.cmm MEMR <stack top> <stack bottom>
You can find those script in \\ce-crash\tools\stack
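You will get a result similar to the one below. The list may contain bogus functions as well, so you need to check it against the source code.
In the original output only the red colored lines are valid (the coloring is not preserved here); they should correspond to the functions in the call stack shown in 1.b). You need to filter out the invalid function names from the list.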
___ _ _ _____ ___
/ __| |_ __ _ __| |__ |_ _| _ __ _ __ ___ _ _ __ _| __| __ _
\__ \ _/ _` / _| / / | || '_/ _` / _/ -_) '_| \ V /__ \_/ _` |
|___/\__\__,_\__|_\_\ |_||_| \__,_\__\___|_| \_/|___(_)__,_|
SP: 0x4452A278 Lmt: 0x4452A278 Size: 0x547 0x0
0x0
Stack Start : 0x4452A7BF
Stack End : 0x4452A278
Stack Limit : 0x4452A278
Stack Size : 0x547
Stack Address Function Name(?)
=================================================================
0x4452A294 mcc_wait
0x4452A2AC mccidl_start_sleep
0x4452A2BC msg_save_3
0x4452A2D4 msg_send_3
0x4452A2FC mccidl_state
0x4452A314 q_check
0x4452A324 mccreg_update_lists
0x4452A33C mccreg_idle_check
0x4452A354 q_put
0x4452A374 mccidl_online
0x4452A394 cdma_idle
0x4452A3A4 mcc_subtask
0x4452A3BC mc_online_digital_state
0x4452A3F4 mc_cdma_prot_activate
0x4452A40C mc_process_cmd
0x4452A434 time_genoff_get_optimized
0x4452A46C timetick_sclk64_get
0x4452A494 time_genoff_get_optimized
0x4452A4B4 time_genoff_opr
0x4452A4D4 osal_atomic_compare_and_set
0x4452A4EC err_f3_trace_secure_offset
0x4452A79C mc_task
Finishing the analysis
3) Using T32 command
If neither (1) nor (2) works properly, there is a T32 command that helps us reconstruct the stack frame manually:
DATA.VIEW %SYMBOL.LONG <address of stack>
This command basically prints out all the values around the address, with symbol information.
So we can pick only reasonable function calls from the window.
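You will get a result similar to the one below. Note that in this capture the symbol column appears one row below the value it belongs to; for example, the value 433D4DE7 at UD:4452A294 resolves to mcc_wait+0xFF, which is printed on the UD:4452A298 row.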
DATA.VIEW %SYMBOL.LONG 0x4452A278
breakpoint________address|_data_______|value___________|symbol
UD:4452A270| 1C 61 A3 45 45A3611C \\M8660AAABQNSZM3161B\Global\mc_stack+0x1AB0
UD:4452A274| 8D 8A 6C 43 436C8A8D
UD:4452A278| 00 00 00 00 0 \\M8660AAABQNSZM3161B\rexl4\rex_wait+0x7F
UD:4452A27C| 70 8F 45 44 44458F70
UD:4452A280| 00 00 00 00 0 \\M8660AAABQNSZM3161B\Global\mc_mode_controller_state
UD:4452A284| 00 00 00 00 0
UD:4452A288| 01 00 05 00 50001
UD:4452A28C| 00 00 00 00 0
UD:4452A290| 00 00 40 00 400000
UD:4452A294| E7 4D 3D 43 433D4DE7
UD:4452A298| 05 00 05 00 50005 \\M8660AAABQNSZM3161B\mccdma\mcc_wait+0xFF
UD:4452A29C| 03 02 00 00 203
UD:4452A2A0| 3C D8 4A 44 444AD83C
UD:4452A2A4| 00 00 00 00 0 \\M8660AAABQNSZM3161B\mccidl\mccidl_fast_raho
UD:4452A2A8| 01 00 00 00 1
UD:4452A2AC| 3F 45 23 44 4423453F
UD:4452A2B0| 00 00 00 00 0 \\M8660AAABQNSZM3161B\mccidl\mccidl_start_sleep+0x97
UD:4452A2B4| 00 00 00 00 0
UD:4452A2B8| 00 00 00 00 0
UD:4452A2BC| 43 31 00 43 43003143
UD:4452A2C0| 00 00 00 00 0 \\M8660AAABQNSZM3161B\msg_api\msg_get_time+0x1
UD:4452A2C4| 00 00 00 00 0
UD:4452A2C8| 00 00 00 00 0
UD:4452A2CC| 00 00 00 00 0
UD:4452A2D0| 00 00 00 00 0
UD:4452A2D4| A5 33 00 43 430033A5
UD:4452A2D8| 00 00 00 00 0 \\M8660AAABQNSZM3161B\msg_api\msg_send_3+0x41
UD:4452A2DC| 00 00 00 00 0
UD:4452A2E0| 00 00 00 00 0
UD:4452A2E4| 00 00 00 00 0
UD:4452A2E8| A8 64 01 45 450164A8
UD:4452A2EC| 03 02 00 00 203 \\M8660AAABQNSZM3161B\Global\cdma+0x3E4
UD:4452A2F0| 02 00 00 00 2
UD:4452A2F4| A8 64 01 45 450164A8
UD:4452A2F8| 00 00 00 00 0 \\M8660AAABQNSZM3161B\Global\cdma+0x3E4
UD:4452A2FC| 83 54 23 44 44235483
UD:4452A300| 03 02 00 00 203 \\M8660AAABQNSZM3161B\mccidl\mccidl_state+0x67D
UD:4452A304| 02 00 00 00 2
UD:4452A308| 90 ED 01 45 4501ED90
4) Restore Call stack manually
This is the most time-consuming procedure, and a tricky one, too. But it is the last method we have if none of the above worked.
You can browse the whole stack area, find possible LR values, and match them against the related source code to check whether each LR is reasonable.
That way, you can get a reasonable call stack in any situation.
Reconstructing the stack frame may differ between ARM and Q6.
Detailed information on this procedure is already documented in “Stability debugging Guide”, doc# 80-VN752-2, so I will skip this part.
Question: Why can't I see the L4 thread list/task list/thread switch list in Trace32?
Answer: The L4 Trace32 extension provides all of these features in Trace32. Make sure that the menu is loaded and turned on.
Details: Every L4 kernel is built with the symbols defined. These can be loaded using the loadsyms.cmm script in the build/ms folder.
Once this is done, please ensure that the L4.T32 and L4.men files are present in the build/ms directory. To reconfigure the menu and turn on the
extension, run the following commands in the Trace32 command window.
Menu.reprogram L4.men
Task.config L4.t32
Document 80-VH783-1 gives more details about the extension itself.
Question : How to debug heap corruption issues
Question: How do I detect and debug heap corruption issues
Answer:
This article refers to debugging of heap corruption and *not* heap exhaustion. The two are quite different. Heap exhaustion occurs when some module requests too much memory without freeing it. An example of heap corruption, on the other hand, is when a module requests a small amount of memory and uses more than it requested, thereby corrupting the rest of the heap structure around its allocated segment.
There is no set way to detect a heap corruption. Crashes often manifest as random issues, and the task that causes the corruption is not necessarily the one that crashes. We usually need multiple memory dumps to diagnose these problems. A good indicator is failing malloc/free calls, which may point to a heap error.
One quick way to check whether a heap is corrupted is to check its structure. For example, for tmc_heap the memory is allocated from tmc_heap_mem_buffer, but the information regarding the heap (pointers to blocks, size allocated, etc.) is stored in a structure called tmc_heap.
You can access it by v.v tmc_heap:
tmc_heap = (
first_block = 0x06748040,
next_block = 0x067482B0,
total_blocks = 0x4,
total_bytes = 0x00068000,
used_bytes = 0x0237,
max_used = 0x0237,
max_request = 0xFF,
fail_fnc_ptr = 0x0,
lock_fnc_ptr = 0x0499D183,
free_fnc_ptr = 0x0499D18B)
If there is an issue with the heap then usually some of these fields will not make sense.
There are two basic ways to debug these sorts of issues:
1) The case where the malloc/free wrapper is passed a bad size value or pointer, and the call itself causes the corruption:
- Add some sanity code to the call to check the state of the heap after the call is done. For example, if the issue is in tmc_heap, the code
could be something like
mem_malloc(){
    ..............
    /* DEBUG CODE */
    if (heap_ptr == &tmc_heap_small || heap_ptr == &tmc_heap)
    {
        /* Check health of both heaps */
        ASSERT((tmc_heap_small.used_bytes >= 0) && (tmc_heap_small.used_bytes <= TMC_HEAP_SMALL_MEM_BUFFER_SIZE));
        ASSERT((tmc_heap.used_bytes >= 0) && (tmc_heap.used_bytes <= TMC_HEAP_MEM_BUFFER_SIZE));
    }
    return NULL;
}
A similar check can also be added to the free function.
2) The case where malloc/free works fine but some task goes beyond its allocated bounds and scribbles over memory that does not belong to it.
- In these cases one debug method is to declare "pad buffers" around the heap data structures and set write breakpoints over their ranges.
Since nobody should normally be accessing them, any write to them is an error. Once the breakpoint is hit, you can look at the call stack to
see who the culprit is.
mem_heap_type tmc_heap;
mem_heap_type tmc_heap_small;
uint32 pad_buffer_1[128]; /* <-- pad buffer */
uint8 tmc_heap_mem_buffer[TMC_HEAP_MEM_BUFFER_SIZE];
uint32 pad_buffer_2[128]; /* <-- pad buffer */
uint8 tmc_heap_small_mem_buffer[TMC_HEAP_SMALL_MEM_BUFFER_SIZE];
uint32 pad_buffer_3[128]; /* <-- pad buffer */
After compiling, please load the ELF into the simulator and confirm that the pad buffers are in the locations we want them to be (you can check through the symbol browser).
Question: How to define OEM_STATIC_HEAP_SIZE in QSC6085
Answer: In QSC6085, you can calculate the maximum value of OEM_STATIC_HEAP_SIZE as follows:
total memory size - (Code+RO+RW+ZI) - L4 heap - PMEM.
If OEM_STATIC_HEAP_SIZE cannot be set to its previous normal value, the physical memory size is not configured correctly.
You should check the row size and column size in SDRAM_DEV_PARAM_CFG1, which is defined in cfg_data and ddr_cdc_cal_data. Also check
QSC6055_MEMORY_0_SIZE in memory_map_full.h and the RAM size configuration in targsn.h and quartz_cfg_N.xml.