Check file naming convention by custom pattern

    Question

  • I will have a lot of files coming into my server and have to check that their names follow the convention agreed for each particular file. I am looking for a pattern that could check them. For example:

    SIR 2016-12.xlsx
    Silnik_5_30_cos_26122017.xlsx
    DIDO 24_05_2017.xlsx
    OMO 12-30.csv
    MOK 5_3_17.csv
    'other file naming conventions, unknown at the moment

    From what I know, the dynamic part will always be a date for each file (the difference could be in the format of the date/datetime/time). What I have in mind is a config file that declares the expected pattern for each file, with the dynamic date/datetime/time part in angle brackets, as follows:

    'the parts within <> are dynamic
    
        RTO TR <YYYY-MM>.xlsx           
        Engine-FR_<YYYY_MM_DD>.csv      
        SIR <YYYY-MM>.xlsx
        Silnik_5_30_cos_<DDMMYYYY>.xlsx
        DIDO <DD_MM_YYYY>.xlsx
        OMO <hh-mm>.csv
        MOK <d_M_YY>.csv

    Afterwards, I would have a "universal checker" that could check/decode a specific incoming filename and verify that it passes the agreed convention. What would you propose, and how would you accomplish that? Examples appreciated, if possible without regular expressions. Note that I am not aware of all the different incoming files, as they will arrive over the course of the project, so it would be good to have a solution that is open to new filenames.


    • Edited by JimmyJimm Wednesday, April 5, 2017 6:49 PM
    Wednesday, April 5, 2017 5:12 PM

All replies

  • I will have a lot of files coming into my server and have to check that their names follow the convention agreed for each particular file. I am looking for a pattern that could check them.

    Your templates would actually look like this:

        *<YYYY-MM>.xlsx           
        *<YYYY_MM_DD>.csv      
        *<DDMMYYYY>.xlsx
        *<DD_MM_YYYY>.xlsx
        *<hh-mm>.csv
        *<d_M_YY>.csv

    where * means 'any number of letters, digits or special characters'. You might need other codes to indicate other rules, such as '?' to indicate 'exactly one letter, digit or special character' and you might need to distinguish between 'any number' (which includes none) and 'at least one'.   It all depends on what the rules are.
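
    As a rough illustration of such wildcard codes without regular expressions, VB's built-in Like operator already understands `*` (any number of characters, including none), `?` (exactly one character) and `#` (exactly one digit). The sketch below is only an assumption about how the rules might be written; the helper name `MatchesWildcard` is hypothetical.

    ```vb
    Imports System

    Module WildcardDemo
        ' Hypothetical helper: True when fileName fits a Like-style wildcard pattern.
        ' With the default Option Compare Binary the comparison is case-sensitive.
        Function MatchesWildcard(fileName As String, pattern As String) As Boolean
            Return fileName Like pattern
        End Function

        Sub Main()
            Console.WriteLine(MatchesWildcard("SIR 2016-12.xlsx", "SIR ####-##.xlsx")) ' True
            Console.WriteLine(MatchesWildcard("SIR 2016-1.xlsx", "SIR ####-##.xlsx"))  ' False
            Console.WriteLine(MatchesWildcard("OMO 12-30.csv", "*.csv"))               ' True
            Console.WriteLine(MatchesWildcard("MOK 5_3_17.csv", "MOK ?_?_??.csv"))     ' True
        End Sub
    End Module
    ```

    Note that this checks only the shape of the name: "SIR 2016-13.xlsx" would also pass, because `#` accepts any digit. Validating the date itself needs a parsing step.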

    Your 'universal checker' then parses the templates and confirms the filename against the template.
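
    A minimal sketch of such a checker, without regular expressions, might look like the following. It assumes each template has literal text, at most one `<...>` block, and literal text after it, and that the tokens inside the brackets are the ones from the question (YYYY, YY, MM, DD, hh, mm, d, M). The names `ToNetFormat` and `MatchesTemplate` are hypothetical.

    ```vb
    Imports System
    Imports System.Globalization

    Module TemplateChecker
        ' Translate the template tokens into a .NET custom date format string.
        ' Assumption: hh means a 24-hour clock, so it maps to HH.
        Function ToNetFormat(tokens As String) As String
            Return tokens.Replace("YYYY", "yyyy").Replace("YY", "yy").
                          Replace("DD", "dd").Replace("hh", "HH")
        End Function

        Function MatchesTemplate(fileName As String, template As String) As Boolean
            Dim startIdx As Integer = template.IndexOf("<"c)
            Dim endIdx As Integer = template.IndexOf(">"c)
            ' A template without a dynamic part must match the whole name.
            If startIdx < 0 OrElse endIdx < startIdx Then
                Return String.Equals(fileName, template, StringComparison.OrdinalIgnoreCase)
            End If

            Dim prefix As String = template.Substring(0, startIdx)
            Dim suffix As String = template.Substring(endIdx + 1)
            Dim dateFormat As String = ToNetFormat(template.Substring(startIdx + 1, endIdx - startIdx - 1))

            If fileName.Length < prefix.Length + suffix.Length Then Return False
            If Not fileName.StartsWith(prefix, StringComparison.OrdinalIgnoreCase) Then Return False
            If Not fileName.EndsWith(suffix, StringComparison.OrdinalIgnoreCase) Then Return False

            ' Whatever sits between prefix and suffix must parse as the declared date/time.
            Dim dynamicPart As String = fileName.Substring(prefix.Length,
                                                           fileName.Length - prefix.Length - suffix.Length)
            Dim parsed As DateTime
            Return DateTime.TryParseExact(dynamicPart, dateFormat, CultureInfo.InvariantCulture,
                                          DateTimeStyles.None, parsed)
        End Function

        Sub Main()
            ' In practice these lines would be read from the config file.
            Dim templates As String() = {"SIR <YYYY-MM>.xlsx", "DIDO <DD_MM_YYYY>.xlsx", "MOK <d_M_YY>.csv"}
            For Each name As String In {"SIR 2016-12.xlsx", "SIR 2016-13.xlsx", "DIDO 24_05_2017.xlsx", "MOK 5_3_17.csv"}
                Dim ok As Boolean = False
                For Each t As String In templates
                    If MatchesTemplate(name, t) Then ok = True
                Next
                Console.WriteLine(name & " -> " & ok.ToString())
            Next
        End Sub
    End Module
    ```

    Because the dynamic part goes through DateTime.TryParseExact, "SIR 2016-13.xlsx" is rejected (there is no month 13) while the other three names pass. New conventions only need a new template line, not new code.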

    Why wouldn't you use regular expressions?  Then the regular expression is the template, and no additional parsing is required in the universal checker. You would create a class that includes the regular expression, a description, and whatever else you need to know about the template.  Store it as a list of those class instances.
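
    One way to sketch that class and list is shown below. The class name `FileNameRule`, its members, and the two example expressions are assumptions made for illustration, not agreed rules.

    ```vb
    Imports System
    Imports System.Collections.Generic
    Imports System.Text.RegularExpressions

    ' Hypothetical class: one agreed naming convention.
    Public Class FileNameRule
        Public ReadOnly Property Description As String
        Public ReadOnly Property Pattern As Regex

        Public Sub New(ruleDescription As String, regexPattern As String)
            Description = ruleDescription
            Pattern = New Regex(regexPattern, RegexOptions.IgnoreCase Or RegexOptions.CultureInvariant)
        End Sub
    End Class

    Module RuleDemo
        ' The list could equally be filled from a config file at startup.
        Private ReadOnly Rules As New List(Of FileNameRule) From {
            New FileNameRule("SIR monthly file", "^SIR [0-9]{4}-(0[1-9]|1[0-2])\.xlsx$"),
            New FileNameRule("OMO time file", "^OMO ([01][0-9]|2[0-3])-[0-5][0-9]\.csv$")
        }

        ' Returns the first rule the file name satisfies, or Nothing if none matches.
        Function FindRule(fileName As String) As FileNameRule
            For Each rule As FileNameRule In Rules
                If rule.Pattern.IsMatch(fileName) Then Return rule
            Next
            Return Nothing
        End Function

        Sub Main()
            For Each name As String In {"SIR 2016-12.xlsx", "OMO 12-30.csv", "OMO 25-30.csv"}
                Dim hit As FileNameRule = FindRule(name)
                Console.WriteLine(name & " -> " & If(hit Is Nothing, "no matching rule", hit.Description))
            Next
        End Sub
    End Module
    ```

    Here "OMO 25-30.csv" matches no rule because the hour part is limited to 00-23. Returning the rule rather than a Boolean lets the caller log which convention a file was accepted under.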

    It would be easier to build a regular expression build tool (to enable any user to describe their file name as a regex, and add it to the list) than it would be to build your own template parser.
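
    A very small version of such a build tool could turn the bracket templates from the question into regular expressions. This is only a sketch under assumptions: one `<...>` block per template, the token set shown in the table, and a hypothetical function name `TemplateToRegex`.

    ```vb
    Imports System
    Imports System.Text.RegularExpressions

    Module TemplateRegexBuilder
        ' Assumed token set, longest tokens first so "YYYY" is replaced before "YY".
        Private ReadOnly Tokens As String(,) = {
            {"YYYY", "[0-9]{4}"}, {"YY", "[0-9]{2}"}, {"DD", "[0-9]{2}"},
            {"MM", "[0-9]{2}"}, {"hh", "[0-9]{2}"}, {"mm", "[0-9]{2}"},
            {"d", "[0-9]{1,2}"}, {"M", "[0-9]{1,2}"}
        }

        Function TemplateToRegex(template As String) As String
            Dim startIdx As Integer = template.IndexOf("<"c)
            Dim endIdx As Integer = template.IndexOf(">"c)
            ' No dynamic part: the whole template is literal text.
            If startIdx < 0 OrElse endIdx < startIdx Then
                Return "^" & Regex.Escape(template) & "$"
            End If

            ' Escape the literal separators first, then swap each token for a digit class.
            Dim body As String = Regex.Escape(template.Substring(startIdx + 1, endIdx - startIdx - 1))
            For i As Integer = 0 To Tokens.GetUpperBound(0)
                body = body.Replace(Tokens(i, 0), Tokens(i, 1))
            Next

            Return "^" & Regex.Escape(template.Substring(0, startIdx)) & body &
                   Regex.Escape(template.Substring(endIdx + 1)) & "$"
        End Function

        Sub Main()
            Dim pattern As String = TemplateToRegex("MOK <d_M_YY>.csv")
            Console.WriteLine(Regex.IsMatch("MOK 5_3_17.csv", pattern))   ' True
            Console.WriteLine(Regex.IsMatch("MOK 5-3-17.csv", pattern))   ' False
        End Sub
    End Module
    ```

    The generated expression checks digit counts and separators only; it does not reject impossible dates such as month 13, so stricter digit ranges or a follow-up date parse would still be needed.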

    Wednesday, April 5, 2017 9:47 PM
  • Hi JimmyJimm,

    It seems that Acamar's reply resolves your issue. If so, please mark it as the answer; it will be beneficial to other community members who have the same issue.

    Thanks for your understanding.

    Best Regards,

    Cherry Bu

    Wednesday, April 12, 2017 9:37 AM
    Moderator
  • Here is an idea. It requires that you write a small piece of code for each 'format'. I do think Acamar is correct about using regex...

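        Public Class FileNameChecker
            ' Returns True when the name starts with a known prefix and, for the
            ' formats finished so far, the remainder parses as the expected date.
            Public Shared Function CheckFileName(fileName As String) As Boolean
                Dim rv As Boolean = True
                ' Invariant lower-casing keeps the prefix tests independent of the server culture.
                Dim fn As String = IO.Path.GetFileNameWithoutExtension(fileName).ToLowerInvariant()
                Dim inv As Globalization.CultureInfo = Globalization.CultureInfo.InvariantCulture
    
                Select Case True
                    Case fn.StartsWith("rto tr ", StringComparison.Ordinal)
                        ' Strip only the leading prefix; Replace would remove every occurrence.
                        fn = fn.Substring("rto tr ".Length)   ' date check still to be written
    
                    Case fn.StartsWith("engine-fr_", StringComparison.Ordinal)
                        fn = fn.Substring("engine-fr_".Length)   ' date check still to be written
    
                    Case fn.StartsWith("sir ", StringComparison.Ordinal)
                        fn = fn.Substring("sir ".Length)
                        If Not DateTime.TryParseExact(fn, "yyyy-MM", inv, Globalization.DateTimeStyles.None, Nothing) Then
                            rv = False
                        End If
    
                    Case fn.StartsWith("silnik_5_30_cos_", StringComparison.Ordinal)
                        fn = fn.Substring("silnik_5_30_cos_".Length)
                        If Not DateTime.TryParseExact(fn, "ddMMyyyy", inv, Globalization.DateTimeStyles.None, Nothing) Then
                            rv = False
                        End If
    
                    Case fn.StartsWith("dido ", StringComparison.Ordinal)
                        fn = fn.Substring("dido ".Length)   ' date check still to be written
    
                    Case fn.StartsWith("omo ", StringComparison.Ordinal)
                        fn = fn.Substring("omo ".Length)   ' time check still to be written
    
                    Case fn.StartsWith("mok ", StringComparison.Ordinal)
                        fn = fn.Substring("mok ".Length)   ' date check still to be written
    
                    Case Else
                        ' Unknown prefix: no agreed convention for this file.
                        rv = False
    
                End Select
    
                Return rv
            End Function
        End Class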
    
    'testing
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Dim testFN As New List(Of String) From {"SIR 2016-12.xlsx", "Silnik_5_30_cos_26122017.xlsx", "DIDO 24_05_2017.xlsx", "OMO 12 - 30.csv", "MOK 5_3_17.csv"}
    
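            For Each fn As String In testFN
                Dim ok As Boolean = FileNameChecker.CheckFileName(fn)
                ' Show each result in the Output window instead of discarding it.
                Debug.WriteLine(fn & " -> " & ok.ToString())
            Next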
        End Sub
    I did a few so you could see how.  Good luck.




    • Edited by dbasnett Wednesday, April 12, 2017 11:13 AM
    Wednesday, April 12, 2017 11:12 AM