QA & Engineering Blog

A Blog about Quality ยท Automation ยท Engineering

๐Ÿ  ํ™ˆ์œผ๋กœ

[Silver IV] DNA - 1969

๋ฌธ์ œ ๋งํฌ

์„ฑ๋Šฅ ์š”์•ฝ

๋ฉ”๋ชจ๋ฆฌ: 116760 KB, ์‹œ๊ฐ„: 168 ms

๋ถ„๋ฅ˜

๋ธŒ๋ฃจํŠธํฌ์Šค ์•Œ๊ณ ๋ฆฌ์ฆ˜, ๊ทธ๋ฆฌ๋”” ์•Œ๊ณ ๋ฆฌ์ฆ˜, ๊ตฌํ˜„, ๋ฌธ์ž์—ด

๋ฌธ์ œ ์„ค๋ช…

DNA๋ž€ ์–ด๋–ค ์œ ์ „๋ฌผ์งˆ์„ ๊ตฌ์„ฑํ•˜๋Š” ๋ถ„์ž์ด๋‹ค. ์ด DNA๋Š” ์„œ๋กœ ๋‹ค๋ฅธ 4๊ฐ€์ง€์˜ ๋‰ดํด๋ ˆ์˜คํ‹ฐ๋“œ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ๋‹ค(Adenine, Thymine, Guanine, Cytosine). ์šฐ๋ฆฌ๋Š” ์–ด๋–ค DNA์˜ ๋ฌผ์งˆ์„ ํ‘œํ˜„ํ•  ๋•Œ, ์ด DNA๋ฅผ ์ด๋ฃจ๋Š” ๋‰ดํด๋ ˆ์˜คํ‹ฐ๋“œ์˜ ์ฒซ๊ธ€์ž๋ฅผ ๋”ฐ์„œ ํ‘œํ˜„ํ•œ๋‹ค. ๋งŒ์•ฝ์— Thymine-Adenine-Adenine-Cytosine-Thymine-Guanine-Cytosine-Cytosine-Guanine-Adenine-Thymine๋กœ ์ด๋ฃจ์–ด์ง„ DNA๊ฐ€ ์žˆ๋‹ค๊ณ  ํ•˜๋ฉด, โ€œTAACTGCCGATโ€๋กœ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  Hamming Distance๋ž€ ๊ธธ์ด๊ฐ€ ๊ฐ™์€ ๋‘ DNA๊ฐ€ ์žˆ์„ ๋•Œ, ๊ฐ ์œ„์น˜์˜ ๋‰ดํด์˜คํ‹ฐ๋“œ ๋ฌธ์ž๊ฐ€ ๋‹ค๋ฅธ ๊ฒƒ์˜ ๊ฐœ์ˆ˜์ด๋‹ค. ๋งŒ์•ฝ์— โ€œAGCAT"์™€ โ€GGAAT"๋Š” ์ฒซ ๋ฒˆ์งธ ๊ธ€์ž์™€ ์„ธ ๋ฒˆ์งธ ๊ธ€์ž๊ฐ€ ๋‹ค๋ฅด๋ฏ€๋กœ Hamming Distance๋Š” 2์ด๋‹ค.

์šฐ๋ฆฌ๊ฐ€ ํ•  ์ผ์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. N๊ฐœ์˜ ๊ธธ์ด M์ธ DNA s1, s2, ..., sn๊ฐ€ ์ฃผ์–ด์ ธ ์žˆ์„ ๋•Œ Hamming Distance์˜ ํ•ฉ์ด ๊ฐ€์žฅ ์ž‘์€ DNA s๋ฅผ ๊ตฌํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์ฆ‰, s์™€ s1์˜ Hamming Distance + s์™€ s2์˜ Hamming Distance + s์™€ s3์˜ Hamming Distance ... ์˜ ํ•ฉ์ด ์ตœ์†Œ๊ฐ€ ๋œ๋‹ค๋Š” ์˜๋ฏธ์ด๋‹ค.

์ž…๋ ฅ

์ฒซ ์ค„์— DNA์˜ ์ˆ˜ N๊ณผ ๋ฌธ์ž์—ด์˜ ๊ธธ์ด M์ด ์ฃผ์–ด์ง„๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๋‘˜์งธ ์ค„๋ถ€ํ„ฐ N+1๋ฒˆ์งธ ์ค„๊นŒ์ง€ N๊ฐœ์˜ DNA๊ฐ€ ์ฃผ์–ด์ง„๋‹ค. N์€ 1,000๋ณด๋‹ค ์ž‘๊ฑฐ๋‚˜ ๊ฐ™์€ ์ž์—ฐ์ˆ˜์ด๊ณ , M์€ 50๋ณด๋‹ค ์ž‘๊ฑฐ๋‚˜ ๊ฐ™์€ ์ž์—ฐ์ˆ˜์ด๋‹ค.

์ถœ๋ ฅ

์ฒซ์งธ ์ค„์— Hamming Distance์˜ ํ•ฉ์ด ๊ฐ€์žฅ ์ž‘์€ DNA ๋ฅผ ์ถœ๋ ฅํ•˜๊ณ , ๋‘˜์งธ ์ค„์—๋Š” ๊ทธ Hamming Distance์˜ ํ•ฉ์„ ์ถœ๋ ฅํ•˜์‹œ์˜ค. ๊ทธ๋Ÿฌํ•œ DNA๊ฐ€ ์—ฌ๋Ÿฌ ๊ฐœ ์žˆ์„ ๋•Œ์—๋Š” ์‚ฌ์ „์ˆœ์œผ๋กœ ๊ฐ€์žฅ ์•ž์„œ๋Š” ๊ฒƒ์„ ์ถœ๋ ฅํ•œ๋‹ค.

๐Ÿ’ก Solutions

๐Ÿ“„ DNA.py

def check(i,j):
    if dnas[i][j] == 'A' : return 0
    if dnas[i][j] == 'C' : return 1
    if dnas[i][j] == 'G' : return 2
    if dnas[i][j] == 'T' : return 3

def rev(x):
    if x == 0 : return 'A'
    if x == 1 : return 'C'
    if x == 2 : return 'G'
    if x == 3 : return 'T'

N, M = map(int,input().split())
dnas = [list(input()) for _ in range(N)]

ans = []
for j in range(M):
    acgt = [0, 0, 0, 0]
    for i in range(N):
        target = check(i,j)
        acgt[target] += 1
    x = acgt.index(max(acgt))
    ans.append(rev(x))
    
print(''.join(ans))
cnt = 0
for i in range(N):
    for j in range(M):
        if ans[j] != dnas[i][j]:
            cnt += 1
            
print(cnt)