none
Datatypes

    Question

  • Just getting started with SQL.  I have reached the point where I can design a database in SQL (marginally) and have figured out that for most of my needs a CHAR or VCHAR will fit most of my non-numeric needs.  However, what I am not getting from the mounds of information I have been perusing is the best practice in usage.

    For instance, a CHAR has a maximum size of 8,000 characters.  Does this mean that every record I create will have 8000 bits (I think that I read that each character is 1 bit, but could be wrong) set aside in memory?  Or is just the amount of that 8000 characters used in memory?

    Should I set the size where I know that the data in the attribute will be less than some number n?

    I guess what I am wanting to know, is what is the memory cost of using CHAR/VCHAR and how do I ensure that I am getting the lowest cost and value out of my usage.


    gwboolean

    Friday, February 09, 2018 7:01 PM

Answers

  • Sam, thanks for the information.  However, I thought I was clear about what I am not clear about (excellent oxymoron).  My confusion centers around (n).  I feel that it is important for me to fully understand what is going on with that.

    So, for CHAR if n = 100, my understanding is that the attribute would set aside 100 characters worth of memory for use and that any additional characters input would be truncated.  Additionally, any characters worth of memory not used would be filled with padding, thus using all 100 characters worth of memory.

    For VCHAR, if n = 100, my understanding is that the attribute would allow up to 100 characters worth of memory for use,  and any additional characters would be truncated (or worse). However, any characters worth of memory not used (down to some minimum) would not be padded, i.e. if only 50 characters were used then then only 50 characters worth of memory would be used.


    gwboolean

    That is correct except for variable length there is additional data for the length that is a fixed amount but I don't know how much precisely; it does not matter to me


    Sam Hobbs
    SimpleSamples.Info

    • Marked as answer by gwboolean Saturday, February 10, 2018 12:12 AM
    Friday, February 09, 2018 10:23 PM

All replies

  • you can refer this discussion for some pros and cons of both

    https://social.msdn.microsoft.com/Forums/sqlserver/en-US/47bfe0ed-daf0-4cd3-bfbd-ca0c17044853/which-is-best-in-performance-char-or-varchar?forum=sqlgetstarted


    Please Mark This As Answer if it solved your issue
    Please Vote This As Helpful if it helps to solve your issue
    Visakh
    ----------------------------
    My Wiki User Page
    My MSDN Page
    My Personal Blog
    My Facebook Page

    Friday, February 09, 2018 7:12 PM
    Moderator
  • IMHO, you should only use char when you know you have a fixed size.  For example, State abbreviation, SSN, etc.

    If you know the normal size of a string, you should define it.  Char is faster read.

    Friday, February 09, 2018 7:36 PM
    Moderator
  • So here is most of what I got out of the references.

    CHAR(n)  

    n Bytes are set aside no matter how much of n is actually used.

    So what occurs with my data if the length of characters is n + 5?  Are the +5 bytes/characters cut-off (I believe that is what occurs, but it is not entirely clear)?


    gwboolean

    Friday, February 09, 2018 7:38 PM
  • It depends.  In some cases, it will be truncated, in others you will get an error.   In no case will it expand the database field >n.

    Friday, February 09, 2018 7:40 PM
    Moderator
  • Another question on VCHAR. 

    So what I get out of VCHAR is that it sets aside n memory VCHAR(n), but that up to 8000 characters are available for data (not including MAX).  Is that correct?

    Additionally, if I wanted to sort or filter on that field then VCHAR would be preferred?


    gwboolean

    Friday, February 09, 2018 8:00 PM
  • You should find some good books, at least one, with the fundamentals. People that volunteer to help others are reluctant to provide fundamentals that are covered extremely well in books and articles and such.

    Characters consist of 1 or more bytes. A byte has 8 bits. that would be explained in all beginner books.

    Yes, if you create a field with 8000 characters then either 8000 bytes or 16000 bytes would be used. If the field uses Unicode characters then it would be 16000 bytes, 128000 bits. Any data that is less than 8000 characters will be "padded" (filled) with spaces to the right.



    Sam Hobbs
    SimpleSamples.Info

    Friday, February 09, 2018 8:04 PM
  • I understand about the books and the unwillingness.  I do have books and reference available to me and what I was not clear on was the meaning of (n) for CHAR and VCHAR. 

    My reading tells me, very unclearly that in the case of CHAR(n), n is the number of bytes/characters that will be set aside in memory regardless of the number of bytes actually input.  Additionally I believe that CHAR fields are faster to read than VCHAR, although I am not sure what that means unless that is a decreased processing time.

    With VCHAR my reading tells me, again very unclearly, that VCHAR(n), n is the maximum number of characters/bytes available for data, but that only a fixed small number of bytes are set aside in memory if the attribute contains no data.  Additionally, I believe that the CHAR fields might sort or filter more efficiently than CHAR fields.


    gwboolean

    Friday, February 09, 2018 8:18 PM
  • First, I think that the number of bytes is not relevant here. We can just talk about the number of characters.

    Conceptually, a variable-length field would consist of a length and the data. So for example "SQL Server" would consist of the number 10 plus the  10 characters. The length is fixed in size, 2 bytes would be enough for the number 8000 but I don't know what SQL Server actually uses. A fixed-length field would always be the specified size.

    Things like the performance of comparison other than the obvious affect of the size are likely insignificant. Programmers often consider processor performance as the only consideration but programmer efficiency is important too. Fixed-length fields might be slightly more efficient but the difference is not important enough to be concerned about.

    For the future, you will get better results if you provide a link tot the documentation and explain what is not clear. The efficiency of the volunteers trying to help you is a very important consideration and communicating the fact that you have tried to find the answer can make a big difference.



    Sam Hobbs
    SimpleSamples.Info


    Friday, February 09, 2018 8:45 PM
  • Sam, thanks for the information.  However, I thought I was clear about what I am not clear about (excellent oxymoron).  My confusion centers around (n).  I feel that it is important for me to fully understand what is going on with that.

    So, for CHAR if n = 100, my understanding is that the attribute would set aside 100 characters worth of memory for use and that any additional characters input would be truncated.  Additionally, any characters worth of memory not used would be filled with padding, thus using all 100 characters worth of memory.

    For VCHAR, if n = 100, my understanding is that the attribute would allow up to 100 characters worth of memory for use,  and any additional characters would be truncated (or worse). However, any characters worth of memory not used (down to some minimum) would not be padded, i.e. if only 50 characters were used then then only 50 characters worth of memory would be used.


    gwboolean

    Friday, February 09, 2018 9:37 PM
  • Sam, thanks for the information.  However, I thought I was clear about what I am not clear about (excellent oxymoron).  My confusion centers around (n).  I feel that it is important for me to fully understand what is going on with that.

    So, for CHAR if n = 100, my understanding is that the attribute would set aside 100 characters worth of memory for use and that any additional characters input would be truncated.  Additionally, any characters worth of memory not used would be filled with padding, thus using all 100 characters worth of memory.

    For VCHAR, if n = 100, my understanding is that the attribute would allow up to 100 characters worth of memory for use,  and any additional characters would be truncated (or worse). However, any characters worth of memory not used (down to some minimum) would not be padded, i.e. if only 50 characters were used then then only 50 characters worth of memory would be used.


    gwboolean

    That is correct except for variable length there is additional data for the length that is a fixed amount but I don't know how much precisely; it does not matter to me


    Sam Hobbs
    SimpleSamples.Info

    • Marked as answer by gwboolean Saturday, February 10, 2018 12:12 AM
    Friday, February 09, 2018 10:23 PM
  • For sorting and filtering there is potential incompatibility between fixed and variable sizes since by default data padded with spaces is not equal to the same data without additional padding.


    Sam Hobbs
    SimpleSamples.Info


    Friday, February 09, 2018 10:27 PM
  • For what it is worth, I have been familiar with the concept of variable-length records and data from before the IBM PC (and therefore before Windows) existed.


    Sam Hobbs
    SimpleSamples.Info

    Friday, February 09, 2018 10:31 PM
  • Thanks Sam

    That was what I was wanting to know.


    gwboolean

    Saturday, February 10, 2018 12:14 AM
  • There is an additional 2 bytes att thhe row level to keep track of the length of for varchar - which isn't needed for char. But in reality you never bother about that. If the data is of variable length by nature, you use varchar. If it is fixed length by nature, you use char.

    Aside from the storage aspect (what is on disk, in the database pages), you also want to consider how much SQL Server be prepared to handle in the execution plans (for various types of temp memory while processing a query). Here is can be a real difference between, say char(50) and char(2000). In the former case, SQL Server will *know* that it can allocate buffers and this data won't be more than 50 bytes). I.e., never overallocate your varchars "just in case".


    Tibor Karaszi, SQL Server MVP (Web Blog)

    Sunday, February 11, 2018 3:00 PM
  • For sorting and filtering there is potential incompatibility between fixed and variable sizes since by default data padded with spaces is not equal to the 

    Can you elaborate on what you mean? AFAIK, trailing spaces are ignored for these purposes in SQL Server.

    DECLARE @Char char(10) = 'ABC';
    DECLARE @VarChar varchar(10) = 'ABC';
    SELECT 'equal' WHERE @Char = @VarChar;


    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Sunday, February 11, 2018 3:36 PM
    Moderator