What is the real maximum string length?



tufofi
2011-10-04, 03:41 PM
What is the real maximum string length?

Some documentation says 100 characters, some says 256, and I just "strcat" -ed a 450 character string.

Anybody have the true scoop on string length limitations and repercussions in AutoLISP/CAD's environment?

Thanks

Doug

Ed Jobe
2011-10-04, 03:54 PM
Are you talking about a TEXT object, MTEXT object, or a lisp symbol?

tufofi
2011-10-04, 06:07 PM
Sorry, as a LISP symbol specifically, but a clear description of the actual limitations, in all its places, would be nice to know.

dgorsman
2011-10-04, 06:53 PM
Which documentation gives the maximum string lengths you mentioned? I've seen a number of non-LISP references indicate 80 characters (ISOGEN), 256 (Win2000 registry REG_SZ entry), and a few others for various applications, but nothing specifically for LISP. I've run LISP string values in the 1000+/- character region before problems start to crop up, but I can't tell if that's from the string value itself or from where the value is being sent to.

watsonlisp
2011-10-04, 09:00 PM
;This autolisp program i wrote to test string length.
;I let it run up to 700,000 characters then canceled it.
;if someone wants to crash autocad and let it run longer, let me know what the results were.

(DEFUN C:SLT ()
  (PROMPT "\n *STRING LIST TEST* ")
  (SETQ HS "X") ; string under test
  (SETQ CT 1)   ; running character count
  (SETQ LP 1)   ; loop flag, never cleared, so the loop runs until cancelled
  (WHILE LP
    (SETQ HS (STRCAT HS "X"))
    (SETQ CT (+ CT 1))
    (PROMPT "\n CHARACTER # ")
    (PRINC CT)
  );END LP
);END SLT

Ed Jobe
2011-10-04, 09:12 PM
Are you assuming that you would get some kind of error to stop the loop? I don't see anything to change the value of LP. What if additional attempts to use (strcat) simply have no effect on the contents of HS? Looks like an endless loop.

Since a lisp symbol is a named pointer to some value, it depends on the data type of the value stored in the symbol. Here is the description of an ActiveX String data type.

There are two kinds of strings: variable-length and fixed-length strings.


A variable-length string can contain up to approximately 2 billion (2^31) characters.
A fixed-length string can contain 1 to approximately 64K (2^16) characters.

watsonlisp
2011-10-04, 10:36 PM
;ok ed, i edited the program to be based on STRLEN and ran it to 700,000 again and it still works.
; who wants to crash autocad with it?




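(DEFUN C:SLT ()
  (PROMPT "\n *STRING LIST TEST* ")
  (SETQ HS "X")
  (SETQ LP 1) ; loop flag, never cleared, so the loop runs until cancelled
  (WHILE LP
    (SETQ HS (STRCAT HS "X"))
    (SETQ CT (STRLEN HS)) ; report the measured length instead of a counter
    (PROMPT "\n CHARACTER # ")
    (PRINC CT)
  );END LP
);END SLT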

irneb
2011-10-05, 12:55 PM
Here's a much "quicker" test:
(defun c:TestStrLen (/ s)
  (setq s "X")
  (while t
    (setq s (strcat s s)) ; double the string on every pass
    (princ "\n")
    (princ (strlen s))
  )
)
Doubling the string's length each time, and deliberately going through an infinite loop. Running it on my Vanilla 2011 on WinXP 32bit, ACad crashes after displaying a strlen of 268435456.

So I'm guessing it's governed by the 32bit integer maximum of 4294967295, or very close to that. Perhaps someone could test the same on 64bit (don't have one available just now).

As for the rest of the OP's questions. Acad has several "limits" on string lengths. Here's a few:


Layer, block & style names: if ExtNames (http://docs.autodesk.com/ACD/2011/ENU/filesACR/WS1a9193826455f5ffa23ce210c4a30acaf-4fe5.htm)=0 then these may only be 31 characters long. Otherwise 255 and may also include some extra characters which are not allowed otherwise.
Linetype definitions have some weird stuff as well: http://www.upfrontezine.com/tailor/tailor15.htm
Text (DText) objects have a maximum character length of 255. This comes from the limitation in DXF of only allowing 255 characters per DXF item and TEXT only having one DXF-code-1 item. Note the limit includes special characters, e.g. the %%c for the diameter character counts as 3 characters (not one).
MText has effectively no limit, though in practice it seems to be around 32000 characters: http://forums.augi.com/showthread.php?t=37010. This is because MText (in addition to the DXF-code-1) may contain an "unlimited" number of code-3's. Thus it breaks the length into chunks of 250 each. http://docs.autodesk.com/ACD/2011/ENU/filesDXF/WS1a9193826455f5ff18cb41610ec0a2e719-79f8.htm
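The chunking described for MText can be sketched roughly as follows. This is only an illustration of the idea: SplitChunks and MTextGroups are made-up helper names, and the exact behaviour at the 250-character boundary is an assumption rather than something taken from the DXF reference.

```lisp
;; Hypothetical helper: split STR into pieces of at most SIZE characters.
(defun SplitChunks (str size / lst)
  (while (> (strlen str) size)
    (setq lst (cons (substr str 1 size) lst) ; take the first SIZE characters
          str (substr str (1+ size))         ; keep the remainder
    )
  )
  (reverse (cons str lst))
)

;; Hypothetical helper: pair each leading chunk with group code 3
;; and the final (possibly shorter) chunk with group code 1.
(defun MTextGroups (str / chunks)
  (setq chunks (SplitChunks str 250))
  (append
    (mapcar '(lambda (c) (cons 3 c)) (reverse (cdr (reverse chunks))))
    (list (cons 1 (last chunks)))
  )
)
```

For a 600-character string, (mapcar 'car (MTextGroups str)) should come back as two 3's followed by a single 1.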

irneb
2011-10-05, 01:25 PM
Actually just realized I'm not counting the numbers correctly. That 268435456 is actually 256 MB. At first I thought it had something to do with RAM size, as my PC only has 3GB. But that 256 sounds a lot more like a hard-wired limit to me.
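The numbers can be checked with a few lines at the console. This is just a sketch: CountDoublings is a made-up name, and the MB figure assumes one byte per character, which the thread does not confirm.

```lisp
(expt 2 28)             ; 268435456, the last length printed before the crash
(/ 268435456 1024 1024) ; 256, i.e. 256 MB at one byte per character (assumption)

;; Hypothetical helper: how many doublings of a 1-character string
;; are needed to reach TARGET characters.
(defun CountDoublings (target / len n)
  (setq len 1
        n   0
  )
  (while (< len target)
    (setq len (* len 2)
          n   (1+ n)
    )
  )
  n
)

(CountDoublings 268435456) ; 28
```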

Ed Jobe
2011-10-05, 03:17 PM
Aaron, I think you missed my point. I didn't say anything about STRLEN. That line is just part of reporting to the user. The strcat function does the real work of the loop, by adding X's on each iteration. Inerb sped up the loop by doubling the characters added each loop. You could have added 100 or 1000 X's. 700k iterations isn't close to the limit.

What I was referring to is that you've got LP set to 1 and then use (While LP). Nothing in your code changes the value of LP, so you've got an infinite loop. Crashing acad is not a very elegant way to get out of a loop...unless that was your goal as was the case with Inerb's example. Hence, my question was if you were expecting an error to get you out of the loop.
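For comparison, a loop with an explicit exit condition might look like the sketch below. The command name SLTBounded and the limit of 100000 are invented for illustration only and are not from anyone's posted code.

```lisp
(defun c:SLTBounded (/ hs limit)
  (setq hs    "X"
        limit 100000 ; arbitrary stopping point, chosen only for illustration
  )
  ;; The test depends on a value the loop body changes, so the loop ends by itself.
  (while (< (strlen hs) limit)
    (setq hs (strcat hs "X"))
  )
  (princ "\n Final length: ")
  (princ (strlen hs))
  (princ)
)
```

Raising the limit then becomes a deliberate choice rather than waiting for a crash or a cancel.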

Ed Jobe
2011-10-05, 03:19 PM
I'm surprised there's that much difference between a lisp string and a vba string. I haven't tested it though.

irneb
2011-10-05, 03:58 PM
Those VBA limits are due to the way the strings are stored. A variable length string is stored much like a linked list - thus your "true" limit is your RAM. And since VBA only works in 32bit mode, the 2GB (31 bit address - as imposed by windows) is where you stop having RAM available for any one process (even if the PC has more RAM installed).

The 16bit (64k) is because fixed length strings are stored as arrays. For some reason the OS doesn't allow larger banks of consecutive memory allocation :shock: - so a larger array is impossible without turning it into some form of hybrid-linked-list (similar to the idea behind MText's 1 and 3 codes).

Lisp on the other hand makes heavy use of pointers (i.e. linked-lists), but there are added requirements as well - due to the garbage collector needing to know if RAM is still pointed to by some symbol. That 256MB limit makes it seem as if the lisp compiler/interpreter doesn't allow larger lists to be generated. Actually I'm not sure how strings are stored in lisp; it might be so inefficient as to use 8 times as much space as truly required -> 256MB x 8 = 2048MB = 2GB

watsonlisp
2011-10-05, 07:42 PM
thanks ed and Inerb,
using strlen to add S to S is a little inaccurate.
Adding 26 million characters to 26 million missed maybe a few million.
But an interesting approach.

watsonlisp
2011-10-05, 08:18 PM
;Ok guys this one increments by 100, change it to 10000 or 100000 if you like.




(DEFUN C:SLT ()
  (PROMPT "\n *STRING LIST TEST* ")
  (SETQ CT2 0)
  (SETQ HS2 "")
  (SETQ LP2 1)
  (PROMPT "\n Building string increment... ")
  (WHILE LP2
    (SETQ HS2 (STRCAT HS2 "X"))
    (SETQ CT2 (+ CT2 1))
    (IF (= CT2 100) (SETQ LP2 NIL)); change 100 to any increment you like!
  );END LP2
  (SETQ HS "")
  (SETQ LP 1) ; never cleared, so this loop runs until cancelled
  (WHILE LP
    (SETQ HS (STRCAT HS HS2))
    (SETQ CT (STRLEN HS))
    (PROMPT "\n CHARACTER # ")
    (PRINC CT)
  );END LP
);END SLT

watsonlisp
2011-10-05, 08:57 PM
;Just for the heck of it, i added an input for increment :)






(DEFUN C:SLT (/ IL CT2 LP2 HS2 HS LP CT)
  (PROMPT "\n *STRING LIST TEST* ")
  (INITGET 7) ; reject empty, zero and negative input
  (SETQ IL (GETINT "\n Enter increment #: ")) ; integer, so the (= CT2 IL) test below can be met
  (SETQ CT2 0)
  (SETQ HS2 "")
  (SETQ LP2 1)
  (PROMPT "\n Building string increment... ")
  (WHILE LP2
    (SETQ HS2 (STRCAT HS2 "X"))
    (SETQ CT2 (+ CT2 1))
    (IF (= CT2 IL) (SETQ LP2 NIL))
  );END LP2
  (SETQ HS "")
  (SETQ LP 1) ; never cleared, so this loop runs until cancelled
  (WHILE LP
    (SETQ HS (STRCAT HS HS2))
    (SETQ CT (STRLEN HS))
    (PROMPT "\n CHARACTER # ")
    (PRINC CT)
  );END LP
  (PRINC)
);END SLT

irneb
2011-10-06, 07:20 AM
Well seeing as you already know that 256MB of characters work fine, why not rather start from that?

E.g.
(defun DoubleStringN (N / s)
  (setq s "X")
  (repeat N (setq s (strcat s s))) ; N doublings give 2^N characters
  s
)

(defun Get256mbString (/)
  (DoubleStringN 28) ; 2^28 = 268435456 characters
)

BTW, it seems it's not a situation of RAM being the factor. It's more like a single lisp symbol may only have a certain maximum RAM usage. While using that code I kept my Task Manager open. After starting ACad, opening VLIDE and loading the above, the acad.exe process was using 342 548 KB.

Then I ran the following in the Visual Lisp Console (comments added to show RAM usage of the acad.exe process):
_$ (strlen (setq s (Get256mbString)))
268435456 ;606 244 KB --> 263 696 KB difference = 257.5 MB
_$ (strlen (setq s2 (Get256mbString)))
268435456 ;870 608 KB --> 264 364 KB difference = 258.2 MB
So obviously I could have 2 separate strings, each 256 MB in length, in RAM at the same time.
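The differences quoted in those comments can be reproduced from the Task Manager readings. A small sketch, with variable names made up for illustration and all values in KB:

```lisp
(setq before 342548 ; acad.exe after loading the code
      after1 606244 ; after building the first string
      after2 870608 ; after building the second string
)

(- after1 before)            ; 263696 KB
(/ (- after1 before) 1024.0) ; roughly 257.5 MB
(- after2 after1)            ; 264364 KB
(/ (- after2 after1) 1024.0) ; roughly 258.2 MB
```

Both figures sit just above 256 MB, which fits roughly one byte per character plus a little overhead, though that reading is an inference and not something measured directly.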

Now to get that "exact" size a string may be, I could still do the following:
_$ (strlen (setq s (strcat (DoubleStringN 28) (DoubleStringN 27))))
402653184
Then I added 2 other functions:
(defun SaveStrLen (s / f)
  (setq f (open "C:\\strlen.txt" "w"))
  (princ (strlen s) f)
  (close f)
)

(defun IncStringN (S N /)
  (repeat N
    (SaveStrLen s) ; record the last length that worked
    (setq s (strcat s "X"))
  )
  s
)
Saving the StrLen to a text file so it stores the last working result. Then I ran:
(strlen (IncStringN s (* 256 1024 1024)))
After the Out of Memory error occurred in VLIDE the file shows 402 653 184 bytes = 393 216 KB = 384 MB = 256 MB + 128 MB = (2^28 + 2^27) bytes (exactly). So there you have it!

Instead of leaving it for days to increment one character at a time, this took me all of 20 minutes - including writing all the code. Simple idea of divide and conquer. ;)
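The same doubling trick can be generalised to build a string of any exact length in a handful of strcat calls, by following the binary digits of the target length. A rough sketch only; MakeStringN is a made-up name and not part of the code posted above.

```lisp
;; Hypothetical helper: return a string of exactly N "X" characters.
(defun MakeStringN (n / result piece)
  (setq result ""
        piece  "X" ; piece length runs 1, 2, 4, 8, ...
  )
  (while (> n 0)
    ;; Append the current piece when the lowest binary digit of N is set.
    (if (= (rem n 2) 1)
      (setq result (strcat result piece))
    )
    (setq n (/ n 2)) ; integer division drops that digit
    ;; Only double the piece if another digit is still to come.
    (if (> n 0)
      (setq piece (strcat piece piece))
    )
  )
  result
)

(strlen (MakeStringN 1000)) ; 1000
```

Something like this would allow probing lengths between 2^28 and 2^28 + 2^27 directly, instead of stepping one character at a time.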

irneb
2011-10-06, 07:40 AM
BTW, I'm sure the alloc (http://docs.autodesk.com/ACD/2011/ENU/filesALR/WS1a9193826455f5ff1a32d8d10ebc6b7ccc-6aba.htm) and expand (http://docs.autodesk.com/ACD/2011/ENU/filesALR/WS1a9193826455f5ff1a32d8d10ebc6b7ccc-6a2e.htm) functions would have some effect on this figure. Haven't tested it though. That I'll leave to someone else :mrgreen:

That might also be why you see the "8" times discrepancy, as the lisp breaks its available memory into chunks for symbols, strings, usubrs (defuns), reals and conses (lists).

Lee Mac
2011-10-06, 10:54 AM
Interesting investigation Irne, thanks for your time.

watsonlisp
2011-10-06, 10:17 PM
thanks for giving it a shot Inerb!
i was worried about going into swap when the ram ran out then filling up my hard drive :)

irneb
2011-10-07, 04:49 AM
You're welcome guys! It's been fun. ;)