The most popular method of user authentication in information systems is the use of passwords. Any password that an attacker or malicious code can easily guess is weak and makes the system vulnerable. System administrators and security analysts use methods for finding weak passwords so that they can proactively protect the information system. When analyzing a system in which weak passwords are to be discovered, the preferred method is always the one that finds the larger number of weak passwords with less time and fewer computing resources. This work describes the research process through which three methods for finding weak passwords were developed. The first method makes it possible to find weak factory-default passwords on devices that are accessible on networks or the Internet, and it builds a list of default and common passwords from various online repositories. The second method uses self-selected keywords to generate a wordlist with the Google search engine. The keywords can be names of people, nicknames, company names and the like, so the wordlists contain terms related to the keywords. The third method makes it possible to find weak passwords that are protected by a one-way cryptographic function, and it is based on modeling user behavior when choosing a password. The study of user behavior was carried out through the development and design of a new method, a new machine learning algorithm that creates rules describing how users create their passwords. The machine learning method was applied to a list of passwords that was available for research purposes. The rules found with the third method can be applied in almost all popular password strength testing tools. Applying such rules proved faster than today's reference methods for password strength testing: the method described in this work discovered a larger number of weak passwords in a shorter time interval than the reference method.
Abstract (English)
This research concerns the detection of weak passwords that can be guessed by an attacker or malware. Such passwords are a security vulnerability because they open the possibility of unauthorized access to a system. Password analysis can be divided into two types. The first is online analysis, where we try to identify weak passwords that are active on an authentication system accessible from the local network or from the public Internet. A large number of devices are vulnerable because they are left with factory default passwords and are accessible via the Internet, where a large number of malware programs and automated scanners try to guess the passwords on those systems with malicious intent. If the passwords are not changed from their default values, they are a major risk to the information system, because network-facing systems are critical for data transmission. In online analysis, we want to check only a small number of passwords against a network service, so that we don't affect the performance of systems that are already under constant attack by automated tools or malicious code from the Internet. The second type is offline analysis, where we analyze a list of passwords used by an authentication system in which each password is protected with a one-way hash function. In this case, the analysis is limited by our processing resources: we want to spend as few resources as possible and detect as many weak passwords as possible in the shortest time frame. Best practices described in the international standards ISO/IEC 27002:2013, NIST 800-115 and NIST 800-63-1 state that passwords need to be complex and protected in such a way that we don't handle or store the plaintext value of a password.
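The offline analysis described above can be illustrated with a minimal sketch: candidate words are hashed and compared against the stored hashes, and every match marks a weak password. The choice of unsalted SHA-256, the function names, and the sample data are assumptions made for illustration only; real authentication systems typically use salted and deliberately slow hash functions, and this is not the implementation used in the research.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Hash a password with unsalted SHA-256 (an assumption for this sketch)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def find_weak(stored_hashes, candidates):
    """Return {hash: plaintext} for every stored hash matched by a candidate."""
    remaining = set(stored_hashes)
    found = {}
    for word in candidates:
        digest = sha256_hex(word)
        if digest in remaining:
            found[digest] = word
            remaining.discard(digest)
            if not remaining:
                break  # every stored hash has been recovered
    return found


# Hypothetical data: three protected passwords and a small candidate list.
stored = [sha256_hex(p) for p in ["admin", "Summer2015!", "x9$Lq72#vT"]]
wordlist = ["password", "admin", "123456", "Summer2015!"]

weak = find_weak(stored, wordlist)
print(sorted(weak.values()))
```

The cost of such a check grows with the number of candidates, which is why the order and quality of the candidate list decide how many weak passwords are found within a given time budget.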
This protection is usually achieved with a one-way cryptographic hash function, which is why we need methods that can test the strength of already protected passwords. The main premise of this research is that users don't pick their passwords randomly, but follow a system or behavioral pattern when they pick them. The behavior of users and malware is represented by models, each of which represents a concrete class of weak passwords. The models are implemented as tools, and they fall into three main categories:
1. The M1 model represents the class of attacks used by automated scanners or other malicious software against systems that are available on the public Internet.
2. The M2 model represents the class of passwords that are tied to personal data or keywords that a user could have picked. We use the M2 model for offline analysis.
3. The M3 models represent the behavior of the user: how the user creates a password, and which elements and modifications they use when creating it. We use the M3 models for offline analysis.
Each model was researched and developed in the following way:
1. For the M1 model, we developed a method for creating and updating a password list from various unstructured repositories maintained by information security consultants or malicious hackers. The method collects those lists and creates a single list based on the relative usage of each username/password pair. This lets us automate the testing of our systems with the passwords that those groups have published, which simplifies the evaluation of weak passwords on systems that are available on our networks.
2. For the M2 model, we developed a method that uses the Google search engine to create wordlists from a few keywords that we pick.
a. Using the M2 method, a base word corpus of popular words was developed that can be used by the M3 model or in other tools.
b. A concept for generating wordlists from Wikipedia database dumps was also developed. Such an approach creates a large word corpus but requires a large amount of compute resources. This drawback led to the development of the method that uses the Google search engine for wordlist generation.
3. For the M3 model, we developed a new machine learning method, which the author calls the sieve method, that enables us to classify passwords and develop a model describing how a user created a password.
a. The sieve method represents a new approach to classification problems; its concept is demonstrated by applying it to password classification.
b. Using the sieve algorithm on a training list of passwords, we developed a set of rules that describe users' behavior when they pick their passwords.
c. Alongside the classification, we also collected the elements, such as words and number and symbol patterns, from which users created their passwords. Those elements were used to augment the wordlist created with the method used in the M2 model.
4. To enable the use of the M3 model and all the accompanying rules discovered in the classification process, a tool named unhash was developed. It allows such rules to be used in almost all popular password security testing tools that can read from standard input (stdin).
To test the performance and speed of the M3 model, four concepts of rule usage were developed and compared to the baseline implementation available in the tool John the Ripper. With the implementation of the M1 model, it has been shown that default passwords can be detected on all devices that were available and owned by the author. By implementing the M2 and M3 models and comparing their performance with today's baseline implementation available in John the Ripper, it has been shown that the suggested implementation detects more weak passwords in a shorter time frame.
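The idea of feeding rule-generated candidates to a testing tool through standard input can be sketched as follows. This is only an illustration, not the unhash tool itself: the rule format (a tuple of element slots), the slot names, and the sample elements are all hypothetical, and the actual rules produced by the sieve method are not specified in this abstract.

```python
import itertools
import sys

# Hypothetical elements of the kind collected during classification:
# words, number patterns and symbol patterns.
ELEMENTS = {
    "word": ["summer", "dragon"],
    "Word": ["Summer", "Dragon"],
    "digits": ["1", "123", "2015"],
    "symbol": ["!", "#"],
}

# Hypothetical rules: each rule lists the slots a password is built from,
# ordered as a learning step might rank them.
RULES = [
    ("word", "digits"),
    ("Word", "digits", "symbol"),
]


def candidates(rules, elements):
    """Yield every password candidate described by the rules, rule by rule."""
    for rule in rules:
        pools = [elements[slot] for slot in rule]
        for parts in itertools.product(*pools):
            yield "".join(parts)


def main(out=sys.stdout):
    # One candidate per line, so that any tool reading stdin can consume them.
    for candidate in candidates(RULES, ELEMENTS):
        out.write(candidate + "\n")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `candidates.py`, the output could be piped into a tool that accepts candidates on standard input, for example John the Ripper with its `--stdin` option. Generating candidates lazily keeps memory use constant even when the rules describe a very large search space.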