Problem
A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence?
The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive).
For example, consider string S = CAGCCTA and arrays P, Q such that:
P[0] = 2    Q[0] = 4
P[1] = 5    Q[1] = 5
P[2] = 0    Q[2] = 6
The answers to these M = 3 queries are as follows:
- The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2.
- The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4.
- The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1.
Write a function:
vector<int> solution(string &S, vector<int> &P, vector<int> &Q);
that, given a non-empty string S consisting of N characters and two non-empty arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries.
Result array should be returned as a vector of integers.
For example, given the string S = CAGCCTA and arrays P, Q such that:
P[0] = 2    Q[0] = 4
P[1] = 5    Q[1] = 5
P[2] = 0    Q[2] = 6
the function should return the values [2, 4, 1], as explained above.
Write an efficient algorithm for the following assumptions:
- N is an integer within the range [1..100,000];
- M is an integer within the range [1..50,000];
- each element of arrays P, Q is an integer within the range [0..N − 1];
- P[K] ≤ Q[K], where 0 ≤ K < M;
- string S consists only of upper-case English letters A, C, G, T.
Solution
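We are given a string S and two vectors P and Q.
Treating P[i] and Q[i] as the start and end indices of a substring of S, we have to find the smallest letter (in ASCII order) among A, C, G and T inside that range and store its value in the answer vector. (A, C, G and T count as 1, 2, 3 and 4 respectively.)
The problem itself is not hard, but scanning the string character by character for every pair P[i], Q[i] frequently runs into the time limit.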
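For reference, the slow approach described above might look roughly like this. It is only a sketch: the names `impactOf` and `naiveSolution` are made up for illustration and are not part of the Codility interface. Each query rescans its whole range, so the total work is O(N * M) in the worst case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: impact factor of one nucleotide (A=1, C=2, G=3, T=4).
int impactOf(char c) {
    switch (c) {
        case 'A': return 1;
        case 'C': return 2;
        case 'G': return 3;
        default:  return 4;  // the input is guaranteed to contain only A, C, G, T
    }
}

// Naive version: rescan S[P[k]..Q[k]] for every query.
std::vector<int> naiveSolution(const std::string &S,
                               const std::vector<int> &P,
                               const std::vector<int> &Q) {
    assert(P.size() == Q.size());
    std::vector<int> ans;
    ans.reserve(P.size());
    for (std::size_t k = 0; k < P.size(); ++k) {
        int best = 4;  // largest possible impact factor
        for (int i = P[k]; i <= Q[k]; ++i)
            best = std::min(best, impactOf(S[i]));
        ans.push_back(best);
    }
    return ans;
}
```

With N up to 100,000 and M up to 50,000, queries that each cover most of the string add up to billions of character reads, which is why this version times out.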
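So I declared a vector of (letter, index) pairs and sorted it in ascending order by letter, breaking ties by ascending index.
After that, for each query I walk through the sorted vector one entry at a time, and as soon as an entry's index falls between P[i] and Q[i], I push that entry's impact factor into the answer vector.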
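To make the sorted order concrete, here is a small sketch of the two steps applied to the sample string. The helper names `sortedByImpact` and `firstInRange` are hypothetical, and the sketch relies on the default `std::pair` comparison, which orders by the first member and then the second, i.e. the same order described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds (impact factor, index) pairs and sorts them
// by impact factor, then by index.
std::vector<std::pair<int, int>> sortedByImpact(const std::string &S) {
    std::vector<std::pair<int, int>> v;
    v.reserve(S.size());
    for (std::size_t i = 0; i < S.size(); ++i) {
        int impact = S[i] == 'A' ? 1 : S[i] == 'C' ? 2 : S[i] == 'G' ? 3 : 4;
        v.emplace_back(impact, static_cast<int>(i));
    }
    std::sort(v.begin(), v.end());  // lexicographic order on (impact, index)
    return v;
}

// Hypothetical helper: impact factor of the first sorted entry whose index
// lies in [p, q]. Because entries are sorted by impact, this is the minimum.
int firstInRange(const std::vector<std::pair<int, int>> &v, int p, int q) {
    assert(p <= q);
    for (const auto &entry : v)
        if (p <= entry.second && entry.second <= q)
            return entry.first;
    return -1;  // not reached for a valid query
}
```

For S = CAGCCTA the sorted vector is (1,1), (1,6), (2,0), (2,3), (2,4), (3,2), (4,5). For the query P = 2, Q = 4 the scan skips indices 1, 6 and 0 and stops at index 3, a C, so the answer is 2.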
I had been stuck on this problem and had stayed away from Codility for a while because of it, but this time I solved it in 20 minutes 8ㅅ8
Code
#include <algorithm>
#include <string>
#include <utility>
#include <vector>
using namespace std;

// Order pairs by impact factor first, then by index.
bool compare(pair<int, int> a, pair<int, int> b) {
    if (a.first == b.first)
        return a.second < b.second;
    else
        return a.first < b.first;
}

vector<int> solution(string &S, vector<int> &P, vector<int> &Q) {
    vector<int> ans;
    vector<pair<int, int>> v;  // (impact factor, index in S)
    for (int i = 0; i < (int)S.size(); i++) {
        int ch;
        if (S[i] == 'A')
            ch = 1;
        else if (S[i] == 'C')
            ch = 2;
        else if (S[i] == 'G')
            ch = 3;
        else
            ch = 4;
        v.push_back(make_pair(ch, i));
    }
    sort(v.begin(), v.end(), compare);
    for (int i = 0; i < (int)P.size(); i++) {
        // The first sorted entry whose index lies in [P[i], Q[i]]
        // has the minimal impact factor for this query.
        for (int j = 0; j < (int)v.size(); j++) {
            if (P[i] <= v[j].second && v[j].second <= Q[i]) {
                ans.push_back(v[j].first);
                break;
            }
        }
    }
    return ans;
}
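The scan above stops early on most inputs, but in the worst case (for example, a string made of a single letter with queries near its end) it still visits almost every entry per query. As an alternative, and not the approach used in this post, the same answers can be computed with prefix counts in O(N + M). The sketch below is only one way to write it, and the name `prefixCountSolution` is made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Alternative technique (prefix counts), not the method from the post.
std::vector<int> prefixCountSolution(const std::string &S,
                                     const std::vector<int> &P,
                                     const std::vector<int> &Q) {
    assert(P.size() == Q.size());
    const std::size_t n = S.size();
    // counts[i][t] = number of A (t=0), C (t=1), G (t=2) in S[0..i-1].
    // T needs no counter: it is the answer when none of the others occur.
    std::vector<std::array<int, 3>> counts(n + 1, std::array<int, 3>{0, 0, 0});
    for (std::size_t i = 0; i < n; ++i) {
        counts[i + 1] = counts[i];
        if (S[i] == 'A') ++counts[i + 1][0];
        else if (S[i] == 'C') ++counts[i + 1][1];
        else if (S[i] == 'G') ++counts[i + 1][2];
    }
    std::vector<int> ans;
    ans.reserve(P.size());
    for (std::size_t k = 0; k < P.size(); ++k) {
        const std::size_t lo = static_cast<std::size_t>(P[k]);
        const std::size_t hi = static_cast<std::size_t>(Q[k]) + 1;
        int impact = 4;  // default: only T in the range
        for (int t = 0; t < 3; ++t) {
            // A positive difference means letter t occurs in S[lo..hi-1].
            if (counts[hi][t] - counts[lo][t] > 0) {
                impact = t + 1;
                break;
            }
        }
        ans.push_back(impact);
    }
    return ans;
}
```

Each query then costs at most three subtractions, regardless of how long the range is.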